(57) Abstract

A method is provided for immobilizing a binding protein capable of binding to a specific compound, using recombinant DNA
techniques for producing said binding protein or a functional part thereof. The binding protein is immobilized by producing it as part of
a chimeric protein also comprising an anchoring part derivable from the C-terminal part of an anchoring protein, thereby ensuring that
the binding protein is localized in or at the exterior of the cell wall of the host cell. Suitable anchoring proteins are yeast α-agglutinin,
FLO1 (a protein associated with the flocculation phenotype in S. cerevisiae), the Major Cell Wall Protein of lower eukaryotes, and a
proteinase of lactic acid bacteria. For secretion the chimeric protein can comprise a signal peptide including those of α-mating factor
of yeast, α-agglutinin of yeast, invertase of Saccharomyces, inulinase of Kluyveromyces, α-amylase of Bacillus, and proteinase of lactic
acid bacteria. Also provided are recombinant polynucleotides encoding such chimeric protein, vectors comprising such polynucleotide,
transformed microorganisms having such chimeric protein immobilized on their cell wall and a process for carrying out an isolation process
by using such transformed host, wherein a medium containing said specific compound is contacted with such host cell to form a complex,
separating said complex from the medium and, optionally, releasing said specific compound from said binding protein.

Title: Immobilized proteins with specific binding capacities and their use in
processes and products

Background of the invention
The pharmaceutical, the fine chemicals and the food industry need a number of
compounds that have to be isolated from complex mixtures such as extracts of
animal or plant tissue, or fermentation broth. Often these isolation processes
determine the price of the product.

Conventional isolation processes are not very specific and during the isolation
processes the compound to be isolated is diluted considerably, with the consequence
that expensive steps for removing water or other solvents have to be applied.

For the isolation of some specific compounds affinity techniques are used. The
advantage of these techniques is that the compounds bind very specifically to a
certain ligand. However, these ligands are quite often very expensive.

To avoid spillage of these expensive ligands they can be linked to an insoluble
support. However, linking the ligand is often also expensive and, moreover, the
functionality of the ligand is often affected negatively by such a procedure.
So a need exists for developing cheap processes for preparing highly effective
immobilized ligands.

Summary of the invention

The invention provides a method for immobilizing a binding protein capable of
binding to a specific compound, comprising the use of recombinant DNA techniques
for producing said binding protein or a functional part thereof still having said
specific binding capability, said protein or said part thereof being linked to the
outside of a host cell, whereby said binding protein or said part thereof is localized
in the cell wall or at the exterior of the cell wall by allowing the host cell to produce
and secrete a chimeric protein in which said binding protein or said functional part
thereof is bound with its C-terminus to the N-terminus of an anchoring part of an
anchoring protein capable of anchoring in the cell wall of the host cell, which
anchoring part is derivable from the C-terminal part of said anchoring protein.

Preferably, the host is selected from Gram-positive bacteria and fungi, which have a
cell wall at the outside of the host cell, in contrast to Gram-negative bacteria and
cells of higher eukaryotes such as animal cells and plant cells, which have a
membrane at the outside of their cells. Suitable Gram-positive bacteria comprise
lactic acid bacteria and bacteria belonging to the genera Bacillus and Streptomyces.
Suitable fungi comprise yeasts belonging to the genera Candida, Debaryomyces,
Hansenula, Kluyveromyces, Pichia and Saccharomyces, and moulds belonging to the
genera Aspergillus, Penicillium and Rhizopus. In this specification the group of fungi
comprises the group of yeasts and the group of moulds, which are also known as
lower eukaryotes. In contrast to the cells in plants and animals, the group of bacteria
and lower eukaryotes are also indicated in this specification as microorganisms.
The invention also provides a recombinant polynucleotide capable of being used in a
method as described above, such polynucleotide comprising (i) a structural gene
encoding a binding protein or a functional part thereof still having the specific
binding capability, and (ii) at least part of a gene encoding an anchoring protein
capable of anchoring in the cell wall of a Gram-positive bacterium or a fungus, said
part of a gene encoding at least the anchoring part of said anchoring protein, which
anchoring part is derivable from the C-terminal part of said anchoring protein.
The anchoring protein can be selected from α-agglutinin, a-agglutinin, FLO1, the
Major Cell Wall Protein of a lower eukaryote, and proteinase of lactic acid bacteria.
Preferably, such polynucleotide further comprises a nucleotide sequence encoding a
signal peptide ensuring secretion of the expression product of the polynucleotide,
which signal peptide can be derived from a protein selected from the α-mating
factor of yeast, α-agglutinin of yeast, invertase of Saccharomyces, inulinase of
Kluyveromyces, α-amylase of Bacillus, and proteinase of lactic acid bacteria. The
polynucleotide can be operably linked to a promoter, which is preferably an
inducible promoter.

The invention further provides a recombinant vector comprising a polynucleotide
according to the invention, a chimeric protein encoded by a polynucleotide
according to the invention, and a host cell having a cell wall at the outside of its cell
and containing at least one polynucleotide according to the invention. Preferably at
least one polynucleotide is integrated in the chromosome of the host cell. Another
embodiment of this part of the invention is a host cell having a chimeric protein
according to the invention immobilized in its cell wall and having the binding
protein part of the chimeric protein localized in the cell wall or at the exterior of
the cell wall.

Another embodiment of the invention is a process for carrying out an isolation
process by using an immobilized binding protein or functional part thereof still
capable of binding to a specific compound, wherein a medium containing said
specific compound is contacted with a host cell according to the invention under
conditions whereby a complex between said specific compound and said immobilized
binding protein is formed, separating said complex from the medium originally
containing said specific compound and, optionally, releasing said specific compound
from said binding protein or functional part thereof.

Brief description of the figures
In Figure 1 the composition of pEMBL9-derived plasmid pUR4122 is indicated, the
preparation of which is described in Example 1.

In Figure 2 the composition of plasmid pUR2741 is indicated, which is a derivative
of published plasmid pUR2740, see Example 1.

In Figure 3 the composition of pEMBL9-derived plasmid pUR2968 is indicated. Its
preparation is described in Example 1.

In Figure 4 the preparation of plasmid pUR4174 starting from plasmids pUR2741,
pUR2968 and pUR4122 is indicated, as well as the preparation of plasmid pUR4175
starting from plasmids pSY16, pUR2968 and pUR4122. These preparations are
described in Example 1.

In Figure 5 the composition of plasmid pUR2743.4 is indicated. Its preparation is
described in Example 2. It contains the 714 bp PstI-XhoI fragment given in
SEQ ID NO: 12, which fragment encodes an scFv-TRAS fragment of anti-traseolide®
antibody 02/01/01.

In Figure 6 the composition of plasmid pUR4178 is indicated. Its preparation is
indicated in Example 2. It contains the above mentioned 714 bp PstI-XhoI fragment
given in SEQ ID NO: 12. This plasmid is suitable for the expression of a fusion
protein between scFv-TRAS and αAGG preceded by the invertase signal sequence
(SUC2).

In Figure 7 the composition of plasmid pUR4179 is indicated. Its preparation is
indicated in Example 2. It contains the above mentioned 714 bp PstI-XhoI fragment
given in SEQ ID NO: 12. This plasmid is suitable for the expression of a fusion
protein between scFv-TRAS and αAGG preceded by the prepro-α-mating factor
signal sequence.

In Figure 8 a molecular design picture is given, showing the musk odour molecule
traseolide® and a modified musk antigen, described in Example 3.
In Figure 9 the composition of plasmid pUR4177 is indicated. Its construction is
described in Example 4. Plasmid pUR4177 contains the 734 bp EagI-XhoI DNA
fragment given in SEQ ID NO: 13 encoding the variable regions of the heavy and
light chain fragments from the monoclonal antibody directed against the human
chorionic gonadotropin (an scFv-HCG fragment) and is a 2 µm-based vector
suitable for production of the chimeric scFv-HCG-αAGG fusion protein preceded by
the invertase signal sequence and under the control of the GAL7 promoter.

In Figure 10 the composition of plasmid pUR4180 is indicated. Its preparation is
indicated in Example 4. It contains the above mentioned 734 bp EagI-XhoI DNA
fragment given in SEQ ID NO: 13 and is a 2 µm-based vector suitable for
production of the chimeric scFv-HCG-αAGG fusion protein preceded by the
prepro-α-mating factor signal sequence and under the control of the GAL7 promoter.

In Figure 11 the composition of plasmid pUR2990, a 2 µm-based vector, is
indicated, which is suggested in Example 5 as a starting vector for the preparation of
plasmid pUR4196 (see Figure 12). Plasmid pUR2990 contains a DNA fragment
encoding a chimeric lipase-FLO1 protein that will be anchored in the cell wall of a
lower eukaryote and can catalyze lipid hydrolysis.

In Figure 12 the composition of plasmid pUR4196 is indicated. Its preparation is
explained in Example 5. It contains a DNA fragment encoding a chimeric protein
comprising the scFv-HCG followed by the C-terminal part of the FLO1-protein, and
is a vector suitable for the production of a chimeric protein anchored in the cell wall
of the host organism and can bind HCG.

In Figure 13 the composition of plasmid pUR2985 is indicated. Its preparation is
described in Example 6. It contains a choB gene coding for the mature part of the
cholesterol oxidase (EC 1.1.3.6) obtained via PCR techniques from the chromosome
of Brevibacterium sterolicum.

In Figure 14 the composition of plasmid pUR2987 is indicated. Its preparation from
plasmid pUR2985 is described in Example 6. It contains a DNA sequence
comprising the choB gene coding for the mature part of the cholesterol oxidase
preceded by DNA encoding the prepro-α-mating factor signal sequence and
followed by DNA encoding the C-terminal part of α-agglutinin.

In Figure 15 the composition of the published plasmid pGKV550 is indicated. It is
described in Example 7 and contains the complete cell wall proteinase operon of
Lactococcus lactis subsp. cremoris Wg2, including the promoter, the ribosome
binding site and the prtP gene.

In Figure 16 the composition of plasmid pUR2988 is indicated. Its preparation is
described in Example 7. It is anticipated that this plasmid can be used for preparing
a further plasmid pUR2989, which after introduction in a lactic acid bacterium will
be responsible for producing a chimeric protein that will be anchored at the outer
surface of the lactic acid bacterium and is capable of binding cholesterol.

In Figure 17 the composition of plasmid pUR2993 is indicated. Its preparation is
described in Example 8. It is anticipated that this plasmid can be used for
transforming yeast cells that can bind a human epidermal growth factor (EGF)
through an anchored chimeric protein containing an EGF receptor.

In Figure 18 the composition of plasmids pUR4482 and 4483 is indicated. Their
preparation is described in Example 9. Plasmid pUR4482 is a yeast episomal
expression plasmid for expression of a fusion protein with the invertase signal
sequence, the CHv09 variable region, the Myc-tail, and the "X-P-X-P" Hinge region
of a camel antibody, and the α-agglutinin cell wall anchor region. Plasmid pUR4483
differs from pUR4482 in that it does not contain the "X-P-X-P" Hinge region.

In Figure 19 immunofluorescent labelling (anti-Myc antibody) of SU10 cells in the
exponential phase (OD530 = 0.5) expressing the genes of camel antibodies present on
plasmids pUR4424, pUR4482 and pUR4483 is shown.
Ph = phase contrast, Fl = fluorescence.

In Figure 20 immunofluorescent labelling (anti-human IgG antibody) of SU10 cells
in the exponential phase (OD530 = 0.5) expressing the genes of camel antibodies
present on plasmids pUR4424, pUR4482 and pUR4483 is shown.
Ph = phase contrast, Fl = fluorescence.



Abbreviations used in the Figures:

α-gal:                gene encoding guar α-galactosidase
AG-alpha-1/AGα1:      gene expressing α-agglutinin from S. cerevisiae
AGα1 cds/α-AGG:       coding sequence of α-agglutinin
Amp/amp r:            β-lactamase resistance gene
CHv09:                camel heavy chain variable 09 fragment
[illegible]:          erythromycin resistance gene
f1:                   phage f1 replication sequence
FLO1/FLO (C-part):    C-terminal part of FLO1 coding sequence of flocculation
                      protein
Hinge:                Camel "X-P-X-P" Hinge region, see Example 9
LEU2:                 LEU2 gene
LEU2d/Leu2d:          truncated LEU2 gene
Leu 2d cs:            coding sequence LEU2d gene
MycT:                 camel Myc-tail
OriMB1:               origin of replication MB1 derived from E. coli plasmid
Pgal7/pGAL7:          GAL7 promoter
Tpgk:                 terminator of the phosphoglycerate kinase gene
ppα-MF/MFα1ss:        prepro-part of α-mating factor (= signal sequence)
repA:                 gene encoding the repA protein required for replication
                      (Fig. 15/16)
ScFv(Vh-Vl):          single chain antibody fragment containing VH and VL chains
ss:                   signal sequence
SUC2:                 invertase signal sequence
2u/2 micron:          2 µm sequence

Detailed description of the invention

The present invention relates to the isolation of valuable compounds from complex
mixtures by making use of immobilized ligands. The immobilized ligands can be
proteins obtainable via genetic engineering and can consist of two parts, namely
both an anchoring protein or functional part thereof and a binding protein or
functional part thereof.

The anchoring protein sticks into cell walls of microorganisms, preferably lower
eukaryotes, e.g. yeasts and moulds. Often this type of protein has a long C-terminal
part that anchors it in the cell wall. These C-terminal parts have very special amino
acid sequences. A typical example is anchoring via C-terminal sequences of proteins
enriched in proline, see Kok (1990).

The C-terminal part of these anchoring proteins can contain a substantial number of
potential serine and threonine glycosylation sites. O-glycosylation of these sites gives
a rod-like conformation to the C-terminal part of these proteins.
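
As a rough illustration of how a C-terminal region might be screened for potential O-glycosylation sites, the following sketch counts serine and threonine residues in the last part of a protein sequence. The sequence, the window length and the function name are hypothetical and chosen only for the example; the presence of Ser/Thr residues does not by itself show that they are glycosylated.

```python
def ser_thr_fraction(protein: str, c_terminal_length: int) -> float:
    """Fraction of Ser (S) and Thr (T) residues in the C-terminal window."""
    if c_terminal_length <= 0:
        raise ValueError("c_terminal_length must be positive")
    window = protein.upper()[-c_terminal_length:]
    if not window:
        raise ValueError("protein sequence is empty")
    hits = sum(1 for residue in window if residue in "ST")
    return hits / len(window)


# Hypothetical sequence, not taken from any real anchoring protein.
example = "MKLAGDEFKRSTSSTTPSSTSTSA"
print(ser_thr_fraction(example, 12))
```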

In the case of anchored manno-proteins they seem to be linked to the glucan in the
cell wall of lower eukaryotes, as they cannot be extracted from the cell wall with
sodium dodecyl sulphate (SDS), but can be liberated by glucanase treatment, see
our co-pending patent application WO-94/01567 (UNILEVER) published 20 January
1994 and Schreuder c.s. (1993), both being published after the claimed priority date.
Another mechanism to anchor proteins at the outer side of a cell is to make use of
the property that a protein containing a glycosyl-phosphatidyl-inositol (GPI) group
anchors via this GPI group to the cell surface, see Conzelmann c.s. (1990).

The binding protein is so called because it ligates or binds to the specific compound
to be isolated. If the N-terminal part of the anchoring protein is sufficiently capable
of binding to a specific compound, the anchoring protein itself can be used in a
process for isolating that specific compound. Suitable examples of a binding protein
comprise an antibody, an antibody fragment, a combination of antibody fragments, a
receptor protein, an inactivated enzyme still capable of binding the corresponding
substrate, and a peptide obtained via Applied Molecular Evolution, see Lewin
(1990), as well as a part of any of these proteinaceous substances still capable of
binding to the specific compound to be isolated. All these binding proteins are
characterized by specific recognition of the compounds or group of related
compounds to be isolated. The binding rate and release rate, and therefore the
binding constant between the specific compound to be isolated and the binding
protein, can be regulated either by changing the composition of the liquid extract in
which the compound is present or, preferably, by changing the binding protein by
protein engineering.
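
The link between the binding rate, the release rate and the binding constant can be sketched as follows, assuming a simple reversible 1:1 binding model (an assumption made here for illustration; the text does not specify a model). Under that model the association constant is the ratio of the two rate constants. The numerical values are hypothetical.

```python
def association_constant(k_on: float, k_off: float) -> float:
    """K_a = k_on / k_off for a reversible 1:1 binding reaction."""
    if k_on <= 0 or k_off <= 0:
        raise ValueError("rate constants must be positive")
    return k_on / k_off


# Hypothetical rate constants: k_on in 1/(M*s), k_off in 1/s.
k_a = association_constant(k_on=1.0e5, k_off=1.0e-3)
print(f"K_a = {k_a:.3g} 1/M, K_d = {1 / k_a:.3g} M")
```

In this picture, slowing the release rate (for example by protein engineering) raises the binding constant, whereas changing the liquid composition may shift either rate.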

The gene coding for the chimeric protein comprising both the binding protein and
the anchoring protein (or functional parts thereof) can be placed under control of a
constitutive, inducible or derepressible promoter and will generally be preceded by a
DNA fragment encoding a signal sequence ensuring efficient secretion of the
chimeric protein. Upon secretion the chimeric protein will be anchored in the cell
wall of the microorganisms, thereby covering the surface of the microorganisms with
the chimeric protein. These microorganisms can be obtained in normal fermentation
processes and their isolation is a cheap process when physical separation processes
are used, e.g. centrifugation or membrane filtration.
After washing, the isolated microorganisms can be added to liquid extracts
containing the valuable specific compound or compounds. After some time the
equilibrium between the bound and free specific compound(s) will be reached and
the microorganisms to which the specific compound or group of related compounds
is bound can be separated from the extract by simple physical techniques.
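
How much compound ends up bound once equilibrium is reached can be estimated with a minimal sketch, assuming 1:1 binding between independent, identical binding sites and the compound (a simplification not stated in the text). With total site concentration B, total compound concentration C and dissociation constant K_d, the complex concentration x solves (B - x)(C - x) = K_d x. All names and numbers below are hypothetical.

```python
import math


def bound_at_equilibrium(sites_total: float, compound_total: float, k_d: float) -> float:
    """Complex concentration at equilibrium for 1:1 binding (same units as inputs)."""
    if sites_total < 0 or compound_total < 0 or k_d <= 0:
        raise ValueError("concentrations must be >= 0 and k_d must be > 0")
    s = sites_total + compound_total + k_d
    # Smaller root of x^2 - s*x + B*C = 0, the physically meaningful one.
    return (s - math.sqrt(s * s - 4.0 * sites_total * compound_total)) / 2.0


# Hypothetical values in arbitrary concentration units.
x = bound_at_equilibrium(sites_total=1.0, compound_total=1.0, k_d=0.5)
print(f"bound fraction of compound: {x / 1.0:.2f}")
```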
Alternatively, the microorganisms covered with ligands can be brought on a support
material and subsequently this coated support material can be used in a column.
The liquid extract containing the specific compound or compounds of interest can be
added to the column and afterwards the compound(s) can be released from the
ligand by changing the composition of the eluting liquid or the temperature or both.
A skilled person will recognize that in addition to these two possibilities other
modifications can be used for effecting the binding of the specific compound and the
ligand, their subsequent isolation and/or the release of the specific compound(s).
In particular the invention relates to chimeric proteins that are bound to the cell
wall of lower eukaryotes. Suitable lower eukaryotes comprise yeasts, e.g. Candida,
Debaryomyces, Hansenula, Kluyveromyces, Pichia and Saccharomyces, and moulds, e.g.
Aspergillus, Penicillium and Rhizopus. For some applications prokaryotes are also
applicable, especially Gram-positive bacteria, examples of which include lactic acid
bacteria, and bacteria belonging to the genera Bacillus and Streptomyces.

For lower eukaryotes the present invention provides genes encoding chimeric
proteins consisting of:

a. a DNA sequence encoding a signal sequence functional in a lower eukaryotic
host, e.g. derived from a yeast protein including the α-mating factor, invertase,
α-agglutinin, inulinase or derived from a mould protein e.g. xylanase;

b. a structural gene encoding a C-terminal part of a cell wall protein preceded by a
structural gene encoding a protein that is capable of binding to the specific
compound or group of compounds of interest, examples of which include

- an antibody,

- a single chain antibody fragment (scFv; see Bird and Webb Walker (1991)),

- a variable region of the heavy chain (VH) or a variable region of the light chain
(VL) of an antibody or that part of such variable region still containing one to
three of the complementarity determining regions (CDRs),

- an agonist-recognizing part of a receptor protein or a part thereof still capable
of binding the agonist,

- a catalytically inactivated enzyme, or a fragment of such enzyme still containing
a substrate binding site of the enzyme,

- specific lipid binding proteins or parts of these proteins still containing the lipid
binding site(s), see Ossendorp (1992), and

- a peptide that has been obtained via Applied Molecular Evolution, see Lewin
(1990).

All expression products of these genes are characterized in that they consist of a
signal sequence and both a protein part that is capable of binding to the
compound(s) to be isolated, and a C-terminus of a typically cell wall bound protein,
examples of the latter including α-agglutinin, see Lipke c.s. (1989), a-agglutinin, see
Roy c.s. (1991), FLO1 (see Example 5 and SEQ ID NO: 14) and the Major Cell
Wall Protein of lower eukaryotes, which C-terminus is capable of anchoring the
expression product in the cell wall of the lower eukaryote host organism.
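
The domain order described above (signal sequence, then binding part, then the C-terminus of a cell wall protein) can be captured in a small data structure. This is only a sketch for keeping track of construct designs; the class, the role names and the example labels are hypothetical and not part of the specification.

```python
from dataclasses import dataclass
from typing import Sequence

# Order of parts from N-terminus to C-terminus, as described in the text.
EXPECTED_ORDER = ("signal_sequence", "binding_part", "anchor_c_terminus")


@dataclass(frozen=True)
class Part:
    role: str  # one of EXPECTED_ORDER
    name: str  # free-text label


def is_valid_chimera(parts: Sequence[Part]) -> bool:
    """True if the parts appear exactly in the order signal, binding part, anchor."""
    return tuple(part.role for part in parts) == EXPECTED_ORDER


construct = [
    Part("signal_sequence", "invertase signal sequence"),
    Part("binding_part", "scFv"),
    Part("anchor_c_terminus", "C-terminal part of alpha-agglutinin"),
]
print(is_valid_chimera(construct))
```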
The expression of these genes encoding chimeric proteins can be under control of a
constitutive promoter, but an inducible promoter is preferred, suitable examples of
which include the GAL7 promoter from Saccharomyces, the inulinase promoter from
Kluyveromyces, the methanol-oxidase promoter from Hansenula, and the xylanase
promoter of Aspergillus. Preferably the constructs are made in such a way that the
new genetic information is integrated in a stable way in the chromosome of the host
cell, see e.g. WO-91/00920 (UNILEVER).
The lower eukaryotes transformed with the above mentioned genes can be grown in
normal fermentation, continuous fermentation, or fed batch fermentation processes.
The selection of a suitable process for growing the microorganism will depend on
the construction of the gene and the promoter used, and on the desired purity of the
cells after the physical separation procedure(s).

For bacteria the present invention deals with genes encoding chimeric proteins
consisting of:

a. a DNA sequence encoding a signal sequence functional in the specific bacterium,
e.g. derived from a Bacillus α-amylase, a Bacillus subtilis subtilisin, or a
Lactococcus lactis subsp. cremoris proteinase;

b. a structural gene encoding a C-terminal part of a cell wall protein preceded by a
structural gene encoding a protein capable of binding to the specific compound or
group of compounds of interest, examples of which are given above for a lower
eukaryote.

All expression products of these genes are characterized in that they consist of a
signal sequence and both a protein part that is capable of binding to the specific
compound or specific group of compounds to be isolated, and a C-terminus of a
typically cell wall-bound protein such as the proteinase of Lactococcus lactis subsp.
cremoris strain Wg2, see Kok c.s. (1988) and Kok (1990), the C-terminus of which is

capable of anchoring the expression product in the cell wall of the host bacterium.

The invention is illustrated with the following Examples without being limited
thereto. First the endonuclease restriction sites mentioned in the Examples are
given.

Enzyme     top strand (5'-3')   bottom strand (3'-5')
BstEII     G GTNACC             CCANTG G
ClaI       AT CGAT              TAGC TA
EcoRI      G AATTC              CTTAA G
NotI       GC GGCCGC            CGCCGG CG
SacI       GAGCT C              C TCGAG
HindIII    A AGCTT              TTCGA A
NruI       TCG CGA              AGC GCT
SalI       G TCGAC              CAGCT G
EagI       C GGCCG              GCCGG C
NheI       G CTAGC              CGATC G
PstI       CTGCA G              G ACGTC
XhoI       C TCGAG              GAGCT C
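
For readers who want to check constructs in silico, the recognition sequences listed above can be searched for in a DNA sequence with a few lines of code. The sketch below is a minimal illustration; the function name and the test sequence are hypothetical (the test sequence is not SEQ ID NO: 1), and N in the BstEII site is treated as any base. The space in each entry of the list above appears to mark the cleavage position, which this sketch does not model.

```python
import re
from typing import List

# Recognition sequences (top strand, 5'->3') of the enzymes listed above.
SITES = {
    "BstEII": "GGTNACC", "ClaI": "ATCGAT", "EcoRI": "GAATTC", "NotI": "GCGGCCGC",
    "SacI": "GAGCTC", "HindIII": "AAGCTT", "NruI": "TCGCGA", "SalI": "GTCGAC",
    "EagI": "CGGCCG", "NheI": "GCTAGC", "PstI": "CTGCAG", "XhoI": "CTCGAG",
}


def find_sites(sequence: str, enzyme: str) -> List[int]:
    """Return the 1-based start position of every occurrence of the enzyme's site."""
    pattern = SITES[enzyme].replace("N", "[ACGT]")
    # The lookahead keeps overlapping matches.
    return [m.start() + 1 for m in re.finditer(f"(?=({pattern}))", sequence.upper())]


# Hypothetical test sequence built from five sites separated by short spacers.
demo = "GAATTCAAACTGCAGTTTGGTCACCAAACTCGAGAAGCTT"
for name in ("EcoRI", "PstI", "BstEII", "XhoI", "HindIII"):
    print(name, find_sites(demo, name))
```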



Example 1. Construction of a gene encoding a chimeric protein that will be
anchored in the cell wall of a lower eukaryote and is able to bind
with high specificity lysozyme from a complex mixture.
Lysozyme is an anti-microbial enzyme with a number of applications in the
pharmaceutical and food industries. Several sources of lysozyme are known, e.g. egg
yolk or a fermentation broth containing a microorganism producing lysozyme.
Monoclonal antibodies have been raised against lysozyme, see Ward c.s. (1989), and
the mRNAs encoding the light and heavy chains of such antibodies have been
isolated from the hybridoma cells and used as template for the synthesis of cDNA
using reverse transcriptase. Starting from the plasmids as described by Ward c.s.
(1989), we constructed a pEMBL-derived plasmid, designated pUR4122, in which
the multiple cloning site of the pEMBL-vector, ranging from the EcoRI to the
HindIII site, was replaced by a 231 bp DNA fragment, whose nucleotide sequence is
given in SEQ ID NO: 1 and has an EcoRI site (GAATTC) at nucleotides 1-6, a PstI
site (CTGCAG) at nucleotides 105-110, a BstEII site (GGTCACC) at nucleotides
122-128, a XhoI site (CTCGAG) at nucleotides 207-212, and a HindIII site
(AAGCTT) at nucleotides 226-231.

Construction of pUR4122

Plasmid pEMBL9, see Dente c.s. (1983), was digested with EcoRI and HindIII and
the resulting large fragment was ligated with the double stranded synthetic DNA
fragment given in SEQ ID NO: 1. For the successive ligation of DNA fragments,
which finally form the coding sequence of a single chain antibody fragment for
lysozyme, the following elements were combined in the 231 bp DNA fragment (SEQ
ID NO: 1) inserted into the pEMBL9 vector: the 3' part of the GAL7 promoter,
the invertase signal sequence (SUC2), a PstI restriction site, a BstEII restriction site,
a sequence encoding the (GGGGS)x3 peptide linker connecting the VH and VL
fragments, a SacI restriction site, a XhoI restriction site and a HindIII restriction site,
resulting in plasmid pUR4119. To obtain the in frame fusion between VH and the
GGGGS-linker, plasmid pSW1-VHD1.3-VKD1.3-TAG1, see Ward c.s. (1989), was
digested with PstI and BstEII and a DNA fragment of 0.35 kbp was ligated in the
correspondingly digested pUR4119, resulting in plasmid pUR4119A. Subsequently
the plasmid pSW1-VHD1.3-VKD1.3-TAG1 was digested with SacI and XhoI and
the fragment containing the coding part of VL was finally ligated into the SacI/XhoI
sites of pUR4119A, resulting in plasmid pUR4122 (see Figure 1).

Construction of pUR4174, see Figure 4
To obtain S. cerevisiae episomal expression plasmids containing DNA encoding a cell
wall anchor derived from the C-terminal part of α-agglutinin, plasmid pUR2741 (see
Figure 2) was selected as starting vector. Basically, this plasmid is a derivative of
pUR2740, which is a derivative of plasmid pUR2730 as described in WO-91/19782
(UNILEVER) and by Verbakel (1991). The preparation of pUR2730 is clearly
described in Example 9 of EP-A1-0255153 (UNILEVER). Plasmid pUR2741 differs
from plasmid pUR2740 in that the EagI restriction site within the remaining part of
the already inactive tet resistance gene was deleted through NruI/SalI digestion. The
SalI site was filled in prior to religation.

After digesting pUR4122 with SacI (partially) and HindIII the approximately 800 bp
fragment was isolated and cloned into the pUR2741 vector fragment, which was
obtained after digestion of pUR2741 with the same enzymes. The resulting plasmid
was named pUR4125.

A plasmid named pUR2968 (see Figure 3) was made by (1) digesting with HindIII
the AGα1-containing plasmid pLα21 published by Lipke c.s. (1989), (2) isolating an
about 6.1 kbp fragment and (3) ligating that fragment with HindIII-treated pEMBL9,
so that the 6.1 kbp fragment was introduced into the HindIII site present in the
multiple cloning site of the pEMBL9 vector.

Plasmid pUR4125 was digested with XhoI and HindIII and the about 8 kbp
fragment was ligated with the approximately 1.4 kbp NheI-HindIII fragment of
pUR2968, using XhoI/NheI adapters having the following sequence:

    XhoI                        NheI
5'- TC GAG ATC AAA GGC GGA TCT G      -3' = SEQ ID NO: 2
3'-      C TAG TTT CCG CCT AGA CGATC  -5' = SEQ ID NO: 3.

The plasmid resulting from the ligation of the appropriate parts of plasmids
pUR2968, pUR4125 and XhoI/NheI adapters was designated pUR4174 and encodes
a chimeric fusion protein consisting, from the amino terminus, of the invertase signal
(pre) peptide, followed by the scFv-LYS polypeptide and, finally, the C-terminal part
of α-agglutinin (see Figure 4).
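
That the two adapter strands given as SEQ ID NO: 2 and SEQ ID NO: 3 anneal to a duplex with XhoI- and NheI-compatible ends can be checked with a short sketch. The helper function is hypothetical; the sequences are those printed above with spaces removed, and the only assumption is that both enzymes leave 4-nucleotide 5' overhangs, which matches the cleavage positions in the restriction site list.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Both strands written 5'->3'. SEQ ID NO: 3 is printed 3'->5' in the text,
# so it is reversed here.
top = "TCGAGATCAAAGGCGGATCTG"            # SEQ ID NO: 2
bottom = "CTAGTTTCCGCCTAGACGATC"[::-1]   # SEQ ID NO: 3, now 5'->3'

overhang = 4  # length of the 5' overhang left by XhoI and by NheI
# The paired region: the top strand without its 5' overhang must equal the
# reverse complement of the bottom strand without its 5' overhang.
duplex_ok = top[overhang:] == reverse_complement(bottom[overhang:])
print(duplex_ok, top[:overhang], bottom[:overhang])
```

The two overhangs printed at the end are the single-stranded ends that pair with an XhoI-cut and an NheI-cut fragment respectively.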

Construction of pUR4175, see Figure 4

Upon digesting pUR4122 (see above) with PstI and HindIII, the approximately
700 bp fragment was isolated and ligated into a vector fragment of plasmid pSY16,
see Harmsen c.s. (1993), which was digested with EagI and HindIII, using
EagI-PstI adapters having the following sequence:

    EagI                     PstI
5'- G GCC GCC CAG GTG CAG CTG CA -3' = SEQ ID NO: 4
3'-       CGG GTC CAC GTC G      -5' = SEQ ID NO: 5

The resulting plasmid, named pUR4132, was digested with XhoI and HindIII and
ligated with the approximately 1.4 kbp NheI-HindIII fragment of pUR2968 (see
above), using XhoI/NheI adapters as described above, resulting in pUR4175 (see
Figure 4). This plasmid contains a gene encoding a chimeric protein consisting of
the α-mating factor prepro-peptide, followed by the scFv-LYS polypeptide and,
finally, the C-terminal part of α-agglutinin.



Example 2. Construction of genes encoding a series of homologous chimeric
proteins that will be anchored in the cell wall of a lower eukaryote
and are able to bind with high specificities the musk fragrance
traseolide® from a complex mixture.
The isolation of RNA from the hybridoma cell lines, the preparation of cDNA and
amplification of gene fragments encoding the variable regions of antibodies by PCR
was performed according to standard procedures known from the literature, see e.g.
Orlandi c.s. (1989). For the PCR amplification different oligonucleotide primers
have been used.

For the heavy chain fragment:

A: AGG TSM ARC TGC AGS AGT CWG G = SEQ ID NO: 6
             PstI
in which S is C or G, M is A or C, R is A or G, and W is A or T,
and

B: TGA GGA GAG GGTGACCGT GGT GCC TTG GCC GG = SEQ ID NO: 7.
               BstEII

For the light chain fragment (Kappa):

C: GAG ATT GAG CTC AGG GAG TCT GGA = SEQ ID NO: 8,
           SacI
and

D: GTT TGA TCT CGA GCT TGG TGG G = SEQ ID NO: 9.
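
Primer A contains the ambiguity codes S, M, R and W and therefore stands for a mixture of oligonucleotides. The following sketch expands it into all unambiguous sequences, using only the code definitions given in the text, and checks that each variant still carries the PstI site CTGCAG. The function name is hypothetical.

```python
from itertools import product
from typing import List

# Ambiguity codes used in primer A, as defined in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "S": "CG", "M": "AC", "R": "AG", "W": "AT"}


def expand(primer: str) -> List[str]:
    """All unambiguous sequences represented by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.replace(" ", "").upper()]
    return ["".join(combo) for combo in product(*choices)]


primer_a = "AGG TSM ARC TGC AGS AGT CWG G"  # SEQ ID NO: 6
variants = expand(primer_a)
print(len(variants))                         # five two-fold positions: 2**5 = 32
print(all("CTGCAG" in v for v in variants))  # PstI site present in every variant
```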

Construction of pUR4143
To simplify future construction work an EagI restriction site was introduced in
pUR4122 (see above), at the junction between the invertase signal sequence and the
scFv-LYS. This was achieved by replacing the about 110 bp EcoRI-PstI fragment
within the synthetic fragment given in SEQ ID NO: 1 by synthetic adapters with the
following sequence:

EcoRI                 PstI
AATTCGGCCGTTCAGGTGCAGCTGCA = SEQ ID NO: 10
    GCCGGCAAGTCCACGTCG     = SEQ ID NO: 11.

The resulting plasmid was designated pUR4122.1: a construction vector for single
chain Fv assembly in frame behind an EagI site for expression behind either the
prepro-α-mating factor sequence or the SUC2 invertase signal sequence.
After digesting the heavy chain PCR fragment with PstI and BstEII, two fragments
were obtained: a PstI fragment of about 230 bp and a PstI/BstEII fragment of about
110 bp. The latter fragment was cloned into vector pUR4122.1, which was digested
with PstI and BstEII. The newly obtained plasmid (pUR4122.2) was digested with
SacI and XhoI, after which the light chain PCR fragment (digested with the same
restriction enzymes) was cloned into the vector, resulting in pUR4122.3. This
plasmid was digested with PstI, after which the above described about 230 bp PstI
fragment was cloned into the plasmid vector, resulting in a plasmid called pUR4143.
Two orientations are possible, but selection can be made by restriction analysis, as
usual. Instead of the scFv-LYS gene originally present in pUR4122, this new plasmid
pUR4143 contains a gene encoding an scFv-TRAS fragment of anti-traseolide®
antibody 02/01/01 (for the nucleotide sequence of the 714 bp PstI-XhoI fragment
see SEQ ID NO: 12).

Construction of pUR4178 and pUR4179.

After digesting pUR4143 with EagI and with HindIII, an about 715 bp fragment can
be isolated. Subsequently, this fragment can be cloned into the vector backbone
fragments of pUR2741 and pUR4175, which were digested with the same restriction
enzymes. In the case of pUR2741, this resulted in plasmid pUR2743.4 (see Figure
5). This plasmid can subsequently be cleaved with XhoI and HindIII and ligated with
the about 8 kbp XhoI-HindIII fragment of pUR4174, resulting in pUR4178 (see
Figure 6).

In the situation where pUR4175 was used as a starting vector, the resulting plasmid
was designated pUR4179 (see Figure 7).

Both plasmids, pUR4178 and pUR4179, were introduced into S. cerevisiae.

Example 3. The modification of the binding parts of the chimeric protein that
can bind traseolide® in order to improve the binding or release of
traseolide® under certain conditions.

Modification of binding properties of antibodies during the immune response is a
well known immunological phenomenon originating from the fine tuning of
complementarity determining sequences in the antibody's binding region to the
antigen's molecular properties. This phenomenon can be mimicked in vitro by
adjusting the antigen binding regions of antibody fragments based on molecular
models of these regions in contact with the antigen.
One such example consists of protein engineering the antimusk antibody M02/01/01
to a stronger binding variant M020501i.

First, a molecular model of the M02/01/01 variable fragment (Fv) was constructed by
homology modelling, using the coordinates of the anti-lysozyme antibody HyHEL-10
as a template (Brookhaven Protein Data Bank entry: 3HFM). This model was
refined using Molecular Mechanics and Molecular Dynamics methods from within
the Biosym program DISCOVER, on a Silicon Graphics 4D240 workstation.
Secondly, the binding site of the resulting Fv was mapped by visually docking the
musk antigen into the CDR region, followed by a refinement using molecular
dynamics again. Upon inspection of the resulting model for packing efficiency (van
der Waals contact areas), it was concluded that substitution of ALA H96 by VAL
would increase the (hydrophobic) contact area between the ligand and Fv, and
consequently lead to a stronger interaction (see Figure 8).
When this mutation is introduced into M02/01/01, the cDNA-derived scFv from
Example 2, the result will be Fv M020501i; a variant with an increased affinity of at
least a factor of 5 can be expected, and the increased affinity could be measured
using fluorescence titration of the Fv with the musk odour molecule.
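To put the expected factor of 5 in energetic terms, the change in binding free energy can be estimated from the standard relation ΔΔG = -RT ln(fold increase). The short Python sketch below is only an illustration: the temperature of 298.15 K is an assumption (the text does not state a measurement temperature), and the function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def ddg_from_affinity_ratio(fold_increase, temperature_k=298.15):
    """Change in binding free energy (kcal/mol) for a given fold increase in affinity.

    temperature_k defaults to 25 degrees C, which is an assumption, not a value
    taken from the text.
    """
    return -R_KCAL * temperature_k * math.log(fold_increase)


ddg = ddg_from_affinity_ratio(5.0)
print(f"{ddg:.2f} kcal/mol")
```

Under this assumption a five-fold affinity gain corresponds to somewhat less than 1 kcal/mol of additional binding energy, which is of the order expected for a small increase in hydrophobic contact area.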
Example 4. Construction of a gene encoding a chimeric protein that will be
anchored in the cell wall of a lower eukaryote and is able to bind
hormones such as HCG.

Gene fragments encoding the variable regions of the heavy and light chain
fragments from the monoclonal antibody directed against the human chorionic
gonadotropin were obtained from a hybridoma cell line in a similar way as described
in Example 2.

Subsequently, these HCG VH and VL gene fragments were cloned into plasmid
pUR4143 by replacing the corresponding PstI-BstEII and SacI-XhoI gene fragments,
resulting in plasmid pUR4146.

Similar to the method described in Example 2, the 734 bp EagI-XhoI fragment
(nucleotide sequence given in SEQ ID NO: 13) encoding the variable regions of the
heavy and light chain fragments from the monoclonal antibody directed against the
human chorionic gonadotropin (an scFv-HCG fragment) was isolated from pUR4146
and was introduced into the vector backbone fragment of pUR4178 (see Example 2)
and will be introduced into the vector backbone fragment of pUR4175 (see Example
1), both digested with the same restriction enzymes. The resulting plasmid
pUR4177 (see Figure 9) was, and plasmid pUR4180 (see Figure 10) will be, introduced
into S. cerevisiae strain SU10.


Example 5. Construction of a gene encoding a chimeric scFv-FLO1 protein that
will be anchored in the cell wall of a lower eukaryote and is able to
bind hormones such as HCG.
One of the genes associated with the flocculation phenotype in S. cerevisiae is the
FLO1 gene. The DNA sequence of a clone containing major parts of the FLO1 gene
has been determined, see SEQ ID NO: 14 giving 2685 bp of the FLO1 gene. The
cloned fragment appeared to be approximately 2 kb shorter than the genomic copy
as judged from Southern and Northern hybridizations, but encloses both ends of the
FLO1 gene. Analysis of the DNA sequence data indicates that the putative protein
contains at the N-terminus a hydrophobic region which confirms a signal sequence
for secretion, a hydrophobic C-terminus that might function as a signal for the
attachment of a GPI-anchor and many glycosylation sites, especially in the
C-terminus, with 46.6% serine and threonine in the arbitrarily defined C-terminus
(aa 271-894). Hence, it is likely that the FLO1 gene product is located in an
orientated fashion in the yeast cell wall and may be directly involved in the process
of interaction with neighbouring cells.

The cloned FLO1 sequence might therefore be suitable for the immobilization of
proteins or peptides on the cell surface by a different type of cell wall anchor.
For the production of a chimeric protein comprising the scFv-HCG followed by the
C-terminal part of the FLO1 protein, plasmid pUR2990 (see Figure 11) can be used
as a starting vector. The preparation of episomal plasmid pUR2990 was described in
our co-pending patent application WO-94/01567 (UNILEVER) published on 20
January 1994, i.e. during the priority year. Plasmid pUR2990 comprises the chimeric
gene consisting of the gene encoding the Humicola lipase and a gene encoding the
putative C-terminal cell wall anchor domain of the FLO1 gene product, the chimeric
gene being preceded by the invertase signal sequence (SUC2) and the GAL7
promoter; further the plasmid comprises the yeast 2 μm sequence, the defective
Leu2 promoter described by Eckard and Hollenberg (1983), and the Leu2 gene, see
Roy c.s. (1991). Plasmid pUR4146, described in Example 4, can be digested with
PstI and XhoI, and the about 0.7 kbp PstI-XhoI fragment containing the scFv-HCG
coding sequence can be isolated. For the in frame fusion of this DNA sequence
between the C-terminal FLO1 part and the SUC2 signal sequence, the fragment can
be directly ligated with the 9.3 kbp EagI/NheI (partial) backbone of plasmid
pUR2990, resulting in plasmid pUR4196 (see Figure 12). This plasmid will comprise
an additional triplet encoding Ala at the transition between the SUC2 signal
sequence and the start of the scFv-HCG, and an E-I-K-G-G amino acid sequence in
front of the first amino acid (Ser) of the C part of FLO1 protein.

If in the previous Examples 1-5 the level of exposed antibody fragments is too low,
the production level can be increased by mutagenesis of the framework regions of
the antibody fragment. This can be done in a site directed way or by (targeted)
random mutagenesis, using techniques described in the literature.
Example 6. Construction of a gene encoding a chimeric protein that will be
anchored in the cell wall of a lower eukaryote and is able to bind
cholesterol.

In the literature two DNA sequences for cholesterol oxidase are described, the choB
gene from Brevibacterium sterolicum, see Ohta c.s. (1991) and the choA gene from
Streptomyces sp. SA-COO, see Ishizaka c.s. (1989). For the construction of a DNA
fusion between the choB gene coding for cholesterol oxidase (EC 1.1.3.6) and the
3' part of the AGα1 gene, the PCR technique on chromosomal DNA can be
applied. Chromosomal DNA can be isolated by standard techniques from
Brevibacterium sterolicum and the DNA part coding for the mature part of the
cholesterol oxidase can be amplified through application of the following
corresponding PCR primers cho01pcr and cho02pcr:

cho01pcr

5'-                   GCC CCC AGC CGC ACC CTC G-3' = SEQ ID NO: 16
3'-                   CGG GGG TCG GCG TGG GAG C-5' = SEQ ID NO: 17
                      ||| ||| ||| ||| ||| ||| |
5'-AGATCTGAATTCGCGGCC GCC CCC AGC CGC ACC CTC G-3' = SEQ ID NO: 18
         EcoRI NotI
                EagI

cho02pcr

                               NheI     HindIII
3'-TAG TAG AGC AGG CTG TAG GTC CGATCGACTTTCGAATCTAGA-5' = SEQ ID NO: 19
   ||| ||| ||| ||| ||| ||| |||
5'-ATC ATC TCG TCC GAC ATC CAG-3' = SEQ ID NO: 20
3'-TAG TAG AGC AGG CTG TAG GTC-5' = SEQ ID NO: 21

Both primers can specifically hybridize with the target sequence, thereby amplifying
the coding part of the gene in such a way that the specific PCR product, after
Proteinase K treatment and digestion with EcoRI and HindIII, can be directly
cloned into a suitable vector, here preferably pTZ19R, see Mead c.s. (1986). This
will result in plasmid pUR2985 (see Figure 13).
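The primer design can be cross-checked with a small Python sketch that tests whether each primer anneals to the choB ends and which restriction sites it carries. The sequences are transcribed from SEQ ID NO: 16 and 18 to 20 above; SEQ ID NO: 16 is taken as GCC CCC AGC CGC ACC CTC G, the reading that is complementary to SEQ ID NO: 17 and identical to the 3' part of SEQ ID NO: 18. The helper names are illustrative only.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Recognition sequences of the enzymes named in the text (all palindromic,
# so checking one strand is sufficient)
SITES = {"EcoRI": "GAATTC", "NotI": "GCGGCCGC", "EagI": "CGGCCG",
         "NheI": "GCTAGC", "HindIII": "AAGCTT"}


def sites_in(seq_5to3):
    """Names of the listed restriction sites present in a 5'->3' sequence."""
    return [name for name, site in SITES.items() if site in seq_5to3]


CHOB_5P_CODING = "GCCCCCAGCCGCACCCTCG"                     # SEQ ID NO: 16
CHO01PCR = "AGATCTGAATTCGCGGCCGCCCCCAGCCGCACCCTCG"         # SEQ ID NO: 18, 5'->3'
CHOB_3P_CODING = "ATCATCTCGTCCGACATCCAG"                   # SEQ ID NO: 20
CHO02PCR_3TO5 = "TAGTAGAGCAGGCTGTAGGTCCGATCGACTTTCGAATCTAGA"  # SEQ ID NO: 19 as printed

# cho01pcr is a sense primer: its 3' part equals the coding strand
cho01_anneals = CHO01PCR.endswith(CHOB_5P_CODING)

# cho02pcr is an antisense primer: printed 3'->5', it pairs with the coding strand
n = len(CHOB_3P_CODING)
cho02_anneals = CHO02PCR_3TO5[:n].translate(COMPLEMENT) == CHOB_3P_CODING

cho01_sites = sites_in(CHO01PCR)
cho02_sites = sites_in(CHO02PCR_3TO5[::-1])   # reverse to read the primer 5'->3'

print(cho01_anneals, cho01_sites)
print(cho02_anneals, cho02_sites)
```

With these readings, cho01pcr contributes the EcoRI and NotI (including EagI) sites at the 5' end of the product and cho02pcr the NheI and HindIII sites at the 3' end.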

In addition to the already mentioned restriction sites both PCR primers generate
other restriction sites at the 5' end and the 3' end of the 1.5 kbp DNA fragment,
which can be used later on to fuse the fragment in frame between either the SUC2
signal sequence or the prepro-α-mating factor signal sequence on one side and the
C-terminus coding part of the α-agglutinin gene on the other side. To facilitate the
ligation behind the prepro-MF sequence a NotI site is introduced at the 5' end of
PCR oligonucleotide cho01pcr, allowing, for example, the exchange of the 731 bp
EagI/NheI fragment containing the scFv-Lys coding sequence in pUR4175 for the
choB coding sequence.

To create an enzymatically inactive fusion protein between cholesterol oxidase and
α-agglutinin, the above described subcloning into pTZ19R can be used. Cholesterol
oxidase is an FAD-dependent enzyme for which the crystal structure of the
Brevibacterium sterolicum enzyme has been determined, see Vrielink c.s. (1991). The
enzyme displays homology with the typical pattern of the FAD-binding domain with
the Gly-X-Gly-X-X-Gly sequence near the N-terminus (amino acid 18-23).
Site-directed in vitro mutagenesis on the plasmid pUR2985 according to the
manufacturer's protocol (Muta-Gene kit, Bio-Rad) can be applied to inactivate the
FAD-binding site through replacing the triplet(s) encoding the Gly residue(s) by
triplets encoding other amino acids, thereby presumably inactivating the enzyme.
E.g. the following primer can be used for site-directed mutagenesis of 2 of the
conserved Gly residues.

pr  3'- CGG GAG CAG TAG CGG TCA CGT ATG CCG CCA CGG CAG CGG CGC -5'
        ||| ||| ||| ||| | | ||| | | ||| ||| ||| ||| ||| ||| |||
cs  5'- GCC CTC GTC ATC GGC AGT GGA TAC GGC GGT GCC GTC GCC GCG -3'
        Ala             Gly     Gly     Gly Gly Ala     Ala Ala
                         |       |
                        Ala     Ala

pr = primer = SEQ ID NO: 22

cs = coding strand = SEQ ID NO: 23
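The effect of the mutagenic primer can be read off by translating the coding strand and the strand specified by the primer. The Python sketch below does this with the standard genetic code. Numbering the first codon shown as residue 14 is an assumption, chosen so that the Gly-X-Gly-X-X-Gly motif falls on residues 18-23 as stated in the text; the names used are illustrative.

```python
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate a 5'->3' coding sequence, reading frame starting at the first base."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# SEQ ID NO: 23 (coding strand, 5'->3') and SEQ ID NO: 22 (primer, printed 3'->5')
CODING = "GCC CTC GTC ATC GGC AGT GGA TAC GGC GGT GCC GTC GCC GCG".replace(" ", "")
PRIMER_3TO5 = "CGG GAG CAG TAG CGG TCA CGT ATG CCG CCA CGG CAG CGG CGC".replace(" ", "")

FIRST_RESIDUE = 14  # assumption: places the Gly-X-Gly-X-X-Gly motif at residues 18-23

wild_type = translate(CODING)
# The base-by-base complement of the 3'->5' primer is the mutated coding strand 5'->3'
mutant = translate(PRIMER_3TO5.translate(COMPLEMENT))

changes = [(FIRST_RESIDUE + i, wt, mut)
           for i, (wt, mut) in enumerate(zip(wild_type, mutant)) if wt != mut]
print(wild_type)
print(mutant)
print(changes)
```

The two mismatches in the primer each change the middle base of a Gly codon, which is consistent with the Gly to Ala exchanges at positions 18 and 20 mentioned in the text.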


As a result of the mutagenesis with the described primer, plasmid pUR2986 will be
obtained. From this plasmid the DNA coding for the presumably inactivated
cholesterol oxidase can be released as a 1527 bp fragment through NotI/NheI
digestion, and subsequently directly used to exchange the scFv-Lys coding sequence
in pUR4175, thereby generating plasmid pUR2987 (see Figure 14). To obtain a
variant yeast secretion vector, where the secretion is directed through the SUC2
signal sequence, for example the 3823 bp long SacI/NheI segment of plasmid
pUR2986 can be used to replace the SacI/NheI fragment in pUR4174.
This inactivation of the FAD-binding site might be preferable over other mutations,
since an unchanged active centre can be expected to leave the binding properties of
cholesterol oxidase for cholesterol unaltered. Instead of the described Gly-Ala
exchanges at position 18 and 20 of the mature coding sequence, every other suitable
amino acid change can also be performed.

To inactivate the enzyme, site-directed mutagenesis can optionally be performed
directly in the active site cavity, for example through exchange of Glu331, a
residue appropriately positioned to act as the proton acceptor, thus generating a new
variant of an immobilized, enzymatically inactive fusion protein.



Example 7. Construction of a gene encoding a chimeric protein that will be
anchored in the cell wall of a lactic acid bacterium and is able to
bind cholesterol.

It has been described that proteinase of Lactococcus lactis subsp. cremoris is
anchored to the cell wall through its 127 amino acid long C-terminus, see Kok c.s.
(1983) and Kok (1990). In a way similar to that described in Example 6, the
cholesterol oxidase of Brevibacterium sterolicum (choB) can be immobilized on the
surface of Lactococcus lactis. Fusions can be made between the choB
structural gene and the N-terminal signal sequence and the C-terminal anchor of the
proteinase of Lactococcus lactis. Plasmid pGKV550 (see Figure 15) contains the
complete proteinase operon of Lactococcus lactis subsp. cremoris Wg2, including the
promoter, a ribosome binding site and DNA fragments encoding the already
mentioned signal and anchor sequences, see Kok (1990). First a DNA fragment,
containing the main part of the signal sequence, flanked by a ClaI site and an EagI
site can be constructed with PCR on pGKV550 as follows:



Primer prt1:
5'-AAGATCTATCGATCTTGTTAGCCGGTACA-3' = SEQ ID NO: 24
Proteinase gene (non coding strand):
3'-TTCCCGATAGCTAGAACAATCGGCCATGTCAG-5' = SEQ ID NO: 25
          ClaI

Proteinase gene:   Gln Ala Lys
5'-GTC GGC GAA ATC CAA GCA AAG GCG GCT-3' = SEQ ID NO: 26
Primer prt2: = SEQ ID NO: 27
3'-CAG CCG CTT TAG GTT CGT TGC CGG CCC CCC TTC GAA CCC-5'
                            EagI           HindIII
After the PCR reaction as described in Example 6, the 98 bp long PCR fragment
can be isolated and digested with ClaI and HindIII. pGKV550 can subsequently be
cleaved partially with ClaI and completely with HindIII, after which digestions the
vector fragment, containing the promoter, the ribosome binding site, the DNA
fragment encoding the N-terminal 8 amino acids and the cell wall binding fragment
containing the 127 C-terminal amino acids of the proteinase gene can be isolated on
gel.

A copy of the cholesterol oxidase gene, suitable for fusion with the prtP anchor
domain can be produced by a PCR reaction using plasmid pUR2985 (Example 6) as
template and a combination of primer cho01pcr (see Example 6) and the following
primer cho03pcr instead of primer cho02pcr:

cho03pcr                            HindIII
3'-TAG TAG AGC AGG CTG TAG GTC CCA GTT CGA ACC TAG GC-5' = SEQ ID NO: 40
   ||| ||| ||| ||| ||| ||| |||
5'-ATC ATC TCG TCC GAC ATC CAG-3' = SEQ ID NO: 20.

The about 1.53 kbp fragment generated by this reaction can be digested with NotI
and HindIII to produce a molecule which can subsequently be ligated with the large
EagI/HindIII fragment from pUR2988 (see Figure 16). The resulting plasmid,
pUR2989, will contain the cholesterol oxidase coding sequence inserted between the
signal sequence and the C-terminal cell wall anchor domain of the proteinase gene.
After introduction into Lactococcus lactis subsp. lactis MG1363 by electroporation,
this plasmid will express cholesterol oxidase under control of the proteinase
promoter. The transport through the membrane will be mediated by the proteinase
signal sequence and the immobilization of the cholesterol oxidase by the proteinase
anchor. As it is unlikely that the Lactococcus will secrete FAD as well, the
cholesterol oxidase will not be active but will be capable of binding cholesterol.



Example 8. Construction of a gene encoding a chimeric protein that will be
anchored in the cell wall of a lower eukaryote and is able to bind
growth hormones, such as the epidermal growth factor.

For the isolation of larger amounts of human epidermal growth factor (EGF) the
corresponding receptor can be used in the form of a fusion between the binding domain
and a C-terminal part of α-agglutinin as cell wall anchor. The complete cDNA
sequence of the human epidermal growth factor receptor has been cloned and
sequenced. For the construction of a fusion protein with EGF binding capacity the
N-terminal part of the mature receptor up to the central 23 amino acid
transmembrane region can be utilized.

The plasmid pUR4175 can be used for the construction. Through digestion with
EagI and NheI (partial) a 731 bp DNA fragment containing the sequence coding for
scFv is released and can be replaced by a DNA fragment coding for the first 621
amino acids of human epidermal growth factor receptor. Starting from an existing
human cDNA library or otherwise through production of a cDNA library by
standard techniques from preferentially EGF receptor overexpressing cells, e.g. A431
carcinoma cells, see Ullrich c.s. (1984), further PCR can be applied for the
generation of in frame linkage between the extracellular binding domain of the
human growth factor receptor (amino acid 1-622) and the C-terminal part of
α-agglutinin.

PCR oligonucleotides for the in frame linkage of human epidermal growth factor
receptor and the C-terminus of α-agglutinin.



a: PCR oligonucleotides for the transition between SUC2 signal sequence and the
N-terminus of mature EGF receptor.

                   > mature EGF receptor
prEGF1:        Ala Leu Glu Glu Lys Lys Val = SEQ ID NO: 28
5'-GGG GCG GCC GCG CTG GAG GAA AAG AAA GTT TGC-3'
       NotI        ||| ||| ||| ||| ||| ||| |||
3'-CGC TCA GCC CGA GAC CTC CTT TTC TTT CAA ACG-5'
EGF rec (non-coding strand): = SEQ ID NO: 29

b: PCR oligonucleotides for the in frame transition between C terminus of the
extracellular binding domain of EGF receptor and the C terminal part of
α-agglutinin.

EGF rec (coding strand):
   Asn Gly Pro Lys Ile Pro Ser Ile Ala Thr
5'-AAT GGG CCT AAG ATC CCG TCC ATC GCC ACT-3' = SEQ ID NO: 30
   ||| ||| ||| ||| ||| ||| |||
3'-TTA CCC GGA TTC TAG GGC AGG CGA TCG GAATTCGAA CCCC-5' = SEQ ID NO: 31
prEGF2:                        NheI       HindIII

This fusion would result in an addition of 2 Ala amino acids between the signal 

sequence and the mature N-terminus of EGF receptor. 

The newly obtained 1.9 kbp PCR fragment can be digested with NotI and NheI and
directly ligated into the vector pUR4175 after digesting with the same enzymes,
resulting in plasmid pUR2993 (see Figure 17), comprising the GAL7 promoter, the
prepro-α-mating factor sequence, the chimeric EGF receptor binding domain gene
/ α-agglutinin gene, the yeast 2 μm sequence, the defective LEU2 promoter and the
LEU2 gene. This plasmid can be transformed into S. cerevisiae and the transformed
cells can be cultivated in YP medium whereby expression of the chimeric protein
can be induced by adding galactose to the medium.



Example 9. Construction of genes encoding a chimeric protein anchored to the
cell wall of yeast, comprising a binding domain of a "Camelidae"
heavy chain antibody

Recently it was described that camels as well as a number of related species (e.g.
llamas) contain a considerable amount of IgG antibody molecules which are only
composed of heavy-chain dimers, see Hamers-Casterman c.s. (1993). Although these
"heavy-chain" antibodies are devoid of light chains, it was demonstrated that they
nevertheless have an extensive antigen-binding repertoire. In order to show that the
variable regions of this type of antibodies can be produced and will be linked to the
exterior of the cell wall of a yeast, the following constructs were prepared.

Construction of pUR2997, pUR2998 and pUR2999

The about 2.1 kbp EagI-HindIII fragment of pUR4177 (Example 4, Figure 9) was
isolated. By using PCR technology, an EcoRI restriction site was introduced
immediately upstream of the EagI site, whereby the C of the EcoRI site is the same
as the first C of the EagI site. The thus obtained EcoRI-HindIII fragment was
ligated into plasmid pEMBL9, which was digested with EcoRI and HindIII, which
resulted in pUR4177.A.

The EcoRI/NheI fragment of plasmid pUR4177.A was replaced by the EcoRI/NheI
fragments of three different synthetic DNA fragments (SEQ ID NO: 32, SEQ ID
NO: 33, and SEQ ID NO: 34) resulting in pUR2997, pUR2998 and pUR2999,
respectively. The about 1.5 kbp BstEII-HindIII fragments of pUR2997 and pUR2998
were isolated.



Construction of pUR4421 

The multiple cloning site of plasmid pEMBL9, see Dente c.s. (1983), (ranging from
the EcoRI to the HindIII site) was replaced by a synthetic DNA fragment having the
nucleotide sequence given below, see SEQ ID NO: 35 giving the coding strand and
SEQ ID NO: 36 giving the non-coding strand. The 5'-part of this nucleotide
sequence comprises an EagI site, the first 4 codons of a Camelidae VH gene
fragment (nucleotides 16-27) and a XhoI site (CTCGAG) coinciding with codons 5
and 6 (nucleotides 28-33). The 3'-part comprises the last 5 codons of the Camelidae
VH gene (nucleotides 46-60) (part of which coincides with a BstEII site), eleven
codons of the Myc tail (nucleotides 61-93), see SEQ ID NO: 35 containing these
eleven codons and SEQ ID NO: 37 giving the amino acid sequence, and an EcoRI
site (GAATTC). The EcoRI site, originally present in pEMBL9, is not functional
any more, because the 5'-end of the nucleotide sequence contains AATTT instead
of AATTC, indicated below as (EcoRI). The resulting plasmid is called pUR4421.
The Camelidae VH fragment starts with amino acids Q-V-K and ends with amino
acids V-S-S.

   (EcoRI) EagI                 XhoI               BstEII
5'-AATTTAGCGG CCGCCCAGGT GAAACTGCTC GAGTAAGTGA CTAAGGTCAC-  50
3'-    ATCGCC GGCGGGTCCA CTTTGACGAG CTCATTCACT GATTCCAGTG-
                    Q  V   K

  -CGTCTCCTCA GAACAAAAAC TCATCTCAGA AGAGGATCTG AATTAATGAG- 100
  -GCAGAGGAGT CTTGTTTTTG AGTAGAGTCT TCTCCTAGAC TTAATTACTC-
     V  S  S   E  Q  K   L  I  S  E   E  D  L   N  *  *  = SEQ ID NO: 37

   EcoRI              HindIII
  -AATTCATCAA ACGGTGATA-3'      119 = SEQ ID NO: 35
  -TTAAGTAGTT TGCCACTATT CGA-5' 123 = SEQ ID NO: 36
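The nucleotide coordinates quoted for this synthetic fragment can be verified against the coding strand with a short Python sketch. The sequence is transcribed from SEQ ID NO: 35 as printed above, coordinates are 1-based and inclusive as in the text, and the helper names are illustrative only.

```python
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Coding strand of the synthetic fragment (SEQ ID NO: 35), in blocks of ten
FRAGMENT = (
    "AATTTAGCGG" "CCGCCCAGGT" "GAAACTGCTC" "GAGTAAGTGA" "CTAAGGTCAC"
    "CGTCTCCTCA" "GAACAAAAAC" "TCATCTCAGA" "AGAGGATCTG" "AATTAATGAG"
    "AATTCATCAAACGGTGATA"
)


def region(first, last):
    """Nucleotides first..last of the fragment (1-based, inclusive)."""
    return FRAGMENT[first - 1:last]


def translate(dna):
    """Translate a 5'->3' sequence in the frame starting at its first base."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


print(len(FRAGMENT))                   # total length of the coding strand
print(translate(region(16, 27)))       # first 4 codons of the VH gene fragment
print(region(28, 33))                  # XhoI site
print(translate(region(46, 60)))       # last 5 codons of the VH gene
print(FRAGMENT.find("GGTCACC") + 1)    # 1-based start of the BstEII site
print(translate(region(61, 93)))       # eleven codons of the Myc tail
print(region(100, 105))                # EcoRI site
```

The check confirms that the stated positions are internally consistent: the BstEII site overlaps the last VH codons, and the Myc tail is followed directly by two stop codons and the EcoRI site.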
Construction of pUR4424

After digesting the plasmid pB09 with XhoI and BstEII, a DNA fragment of about
0.34 kbp was isolated from agarose gel. This fragment codes for a truncated VH
fragment, missing both the first 4 and the last 5 amino acids of the Camelidae VH
fragment. Plasmid pB09 was deposited as E. coli JM109 pB09 at the Centraal
Bureau voor Schimmelcultures, Baarn on 20 April 1993 with deposition number
CBS 271.93. The DNA and amino acid sequences of the camel VH fragments
followed by the Flag sequence as present in plasmid pB09 were given in Figure 6B
of European patent application 93201239.6 (not yet published), which is herein
incorporated by reference. The obtained about 0.34 kbp fragment was cloned into
pUR4421. To this end plasmid pUR4421 was digested with XhoI and HindIII, after
which the about 4 kb vector fragment was isolated from an agarose gel. The
resulting vector was ligated with the about 0.34 kbp XhoI/BstEII fragment and a
synthetic DNA linker having the following sequence:

BstEII            HindIII
GTCACCGTCTCCTCATAATGA     = SEQ ID NO: 38
     GCAGAGGAGTATTACTTCGA = SEQ ID NO: 39

resulting in plasmid pUR4421-09.
Plasmid pSY16 was digested with EagI and HindIII, after which the about 6.5 kbp
long vector backbone was isolated and ligated with the about 0.38 kbp EagI/HindIII
fragment from pUR4421-09 resulting in pUR4424.

Construction of pUR4482 and pUR4483

From pUR4424 the about 0.44 kbp SacI-BstEII fragment, coding for the invertase
signal sequence and the camel heavy chain variable 09 (= CHv09) fragment, was
isolated as well as the about 6.3 kbp SacI-HindIII vector fragment. The about 6.3
kbp fragment and the about 0.44 kbp fragment from pUR4424 were ligated with the
BstEII-HindIII fragment from pUR2997 or pUR2998 yielding pUR4482 and
pUR4483, respectively.

Plasmid pUR4482 is thus a yeast episomal expression plasmid for expression of a
fusion protein with the invertase signal sequence, the CHv09 variable region, the
Myc-tail and the Camel "X-P-X-P" Hinge region, see Hamers-Casterman c.s.
(1993), and the α-agglutinin cell wall anchor region. Plasmid pUR4483 differs from
pUR4482 in that it contains the Myc-tail but not the "X-P-X-P" Hinge region.
Similarly, the BstEII-HindIII fragment from pUR2999 can be ligated with the about
6.3 kbp vector fragment and the about 0.44 kbp fragment from pUR4424, resulting
in pUR4497, which will differ from pUR4482 in that it contains the "X-P-X-P" Hinge
region but not the Myc-tail.

The plasmids pUR4424, pUR4482 and pUR4483 were introduced into
Saccharomyces cerevisiae SU10 by electroporation, and transformants were selected
on plates lacking leucine. Transformants from SU10 with pUR4424, pUR4482 or
pUR4483, respectively, were grown on YP with 5% galactose and analysed with
immuno-fluorescence microscopy, as described in Example 1 of our co-pending
WO-94/01567 (UNILEVER) published on 20 January 1994. This method was slightly
modified to detect the chimeric proteins, containing both the camel antibody and
the Myc tail, present at the cell surface.

In one method a monoclonal mouse anti-Myc antibody was used as a first antibody
to bind to the Myc part of the chimeric protein; subsequently a polyclonal
anti-mouse Ig antiserum labeled with fluorescein isothiocyanate (= FITC) ex Sigma,
Product No. F-0527, was used to detect the bound mouse antibody and a positive
signal was determined by fluorescence microscopy.

In the other method a polyclonal rabbit anti-human IgG serum, which had earlier
been proven to cross-react with the camel antibodies, was used as a first antibody to
bind the camel antibody part of the chimeric protein; subsequently a polyclonal
anti-rabbit Ig antiserum labeled with FITC ex Sigma, Product No. F-0382, was used to
detect the bound rabbit antibody and a positive signal was determined by
fluorescence microscopy.

The results in Figure 19 and Figure 20 show clearly that fluorescence can be
observed on those cells in which a fusion protein of the CHv09 fragment with the
α-agglutinin cell wall anchor region is produced (pUR4482 and pUR4483). No
fluorescence, however, was visible on the cells which produce the CHv09 fragment
without this anchor (pUR4424), when viewed under the same circumstances.
SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT:
(A) NAME: Unilever N.V.
(B) STREET: Weena 455
(C) CITY: Rotterdam
(E) COUNTRY: The Netherlands
(F) POSTAL CODE (ZIP): NL-3013 AL

(A) NAME: Unilever PLC
(B) STREET: Unilever House Blackfriars
(C) CITY: London
(E) COUNTRY: United Kingdom
(F) POSTAL CODE (ZIP): EC4P 4BQ

(A) NAME: Leon Gerardus J. FRENKEN
(B) STREET: Geldersestraat 90
(C) CITY: Rotterdam
(E) COUNTRY: The Netherlands
(F) POSTAL CODE (ZIP): NL-3011 MP

(A) NAME: Pieter DE GEUS
(B) STREET: Boeier 24
(C) CITY: Barendrecht
(E) COUNTRY: The Netherlands
(F) POSTAL CODE (ZIP): NL-2991 KB

(A) NAME: Franciscus Maria KLIS
(B) STREET: Benedenlangs 102
(C) CITY: Amsterdam
(E) COUNTRY: The Netherlands
(F) POSTAL CODE (ZIP): NL-1025 KL

(A) NAME: Holger York TOSCHKA, c/o Langnese Iglo, BR3
(B) STREET: Aeckern 1
(C) CITY: Reken
(E) COUNTRY: Germany
(F) POSTAL CODE (ZIP): D-48734

(A) NAME: Cornelis Theodorus VERRIPS
(B) STREET: Hagedoorn 18
(C) CITY: Maassluis
(E) COUNTRY: The Netherlands
(F) POSTAL CODE (ZIP): NL-3142 KB


(ii) TITLE OF INVENTION: Immobilized proteins with specific binding 
capacities and their use in processes and products. 

(iii) NUMBER OF SEQUENCES: 40

(iv) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO)

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 231 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(vii) IMMEDIATE SOURCE:

(B) CLONE: fragment in pUR4119

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

GAATTCGACC TCATCACACA AACAAACAAA ACAAAATGAT GCTTTTGCAA GCCTTTCTTT 

TCCTTTTGGC TGGTTTTGCA GCCAAAATAT CTGCGCAGGT GCAGCTGCAG TAATGAACCA 

CGGTCACCGT CTCCTCAGGT GGAGGCGGTT CAGGCGGAGG TGGCTCTGGC GGTGGCGGAT 

CGGACATCGA GCTCACTCAG ACCAAGCTCG AGATCAAACG GTGATAAGCT T 

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vii) IMMEDIATE SOURCE:
(B) CLONE: linker XhoI-NheI coding strand

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
TCGAGATCAA AGGCGGATCT G

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vii) IMMEDIATE SOURCE:
(B) CLONE: linker XhoI-NheI non-coding strand

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
CTAGCAGATC CGCCTTTGAT C

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vii) IMMEDIATE SOURCE:
(B) CLONE: linker EagI-PstI coding strand

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
GGCCGCCCAG GTGCAGCTGC A

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 13 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vii) IMMEDIATE SOURCE:
(B) CLONE: linker EagI-PstI non-coding strand

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
GCTGCACCTG GGC 13

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: PCR primer A (heavy chain) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

AGGTSMARCT GCAGSAGTCW GG 22
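PCR primer A contains the IUPAC ambiguity codes S, M, R and W, so it stands for a pool of oligonucleotides rather than a single sequence. The sketch below expands that pool using the standard IUPAC definitions. The function name `expand_degenerate` is hypothetical and the code is only an illustration of how the degenerate notation is read.

```python
from itertools import product

# IUPAC nucleotide codes needed for this primer (standard definitions).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "S": "CG",  # C or G
    "M": "AC",  # A or C
    "R": "AG",  # A or G
    "W": "AT",  # A or T
}


def expand_degenerate(primer):
    """List every plain A/C/G/T sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer]
    return ["".join(combo) for combo in product(*choices)]


primer_a = "AGGTSMARCTGCAGSAGTCWGG"  # SEQ ID NO: 6, spaces removed
variants = expand_degenerate(primer_a)

# Two options at each of the five degenerate positions.
print(len(variants))
```

The CTGCAG stretch in the middle of the primer contains no ambiguity codes, so every member of the pool carries it unchanged.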

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE:

(B) CLONE: PCR primer B (heavy chain)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

TGAGGAGACG GTGACCGTGG TCCCTTGGCC CC 32 

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: PCR primer C (light chain) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

GACATTGAGC TCACCCAGTC TCCA 24 



(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(vii) IMMEDIATE SOURCE: 

(B) CLONE: PCR primer D (light chain) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

GTTTGATCTC GAGCTTGGTC CC 22 

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vii) IMMEDIATE SOURCE:

(B) CLONE: linker EcoRI-PstI coding strand

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
AATTCGGCCC TTCAGGTGCA GCTGCA 26 
(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE:

(B) CLONE: linker EcoRI-PstI non-coding strand

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
GCTGCACCTG AACGGCCG 18
(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 714 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: ScFv anti-traseolide 02/01/01

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:



CTGCAGGACT CTGGACCTGG CCTGGTGAAA CCTTCTCAGT CTCTGTCCCT CACCTGCACT 60

GTCACTGGCT ACTCAATCAC CAGTGATTTT GCCTGGAACT GGATCCGGCA GTTTCCAGGA 120

[illegible] AGTGGATGGG CTACATAAGC TACAGTGGTA GCACTAGCTA CAACCCATCT 180

[illegible] [illegible] CACTCGACAC ACATCCAACA ACCAGTTCTT CCTGCACTTG 240

AATTCTGTGA CTACTGAGGA CACAGCCACA TATTACTGTG CAACGTCCCT AACATGGTTA 300

CTACCTCGGA [illegible] CTGGGGCCAA GGGACCACGG TCACCGTCTC CTCAGGTGGA 360

GGCGGTTCAG GCGGAGGTGG CTCTGGCGGT GGCGGATCGG ACATCGAGCT CACCCAGTCT 420

CCATCCTCCA TGTCTGTATC TCTGGGAGAC ACAGTCAGCA TCACTTGCCA TGCAAGTCAG 480

GACATTAGCA GTAATATAGG GTGGTTGCAG CAGAAACCAG GGAAATCATT TAAGGGCCTG 540

ATCTATCATG GAACCAACTT CGAACATCGT ATTCCATCAA GGTTCAGTGG CAGTGGATCT 600

GGAGCAGATT ATTCCCTCAC CATCAGCAGC CTGGAATCTG AAGATTTTGC AGACTATTAC 660

TGTGTACAGT ATGCTCAGTT TCCATTCACG TTCGGCTCGG GGACCAAGCT CGAG 714



(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 734 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vii) IMMEDIATE SOURCE:

(B) CLONE: ScFv anti-HCG

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:



CGGCCGTTCA GGTCCAGCTG CAGGAGTCTG GGGGACACTT AGTGAAGCCT GGAGGGTCCC 60

TGAAACTCTC CTGTGCAGCC TCTGGATTCG CTTTCAGTAG CTTTGACATG TCTTGGATTC 120

GCCAGACTCC GGAGAAGAGG CTGGAGTGGG TCGCAAGCAT TACTAATGTT GGTACTTACA 180

CCTACTATCC AGCCAGTGTG AAGGGCCGAT TCTCCATCTC CAGAGACAAT GCCAGGAACA 240

CCCTAAACCT GCAAATGAGC AGTCTGAGGT CTGAGGACAC GGCCTTGTAT TTCTGTGCAA 300

GACAGGGGAC TGCGGCACAA CCTTACTGGT ACTTCGATGT CTGGGGCCAA GGGACCACGG 360

TCACCGTCTC CTCAGGTGGA GGCGGTTCAG GCGGAGGTGG CTCTGGCGGT GGCGGATCGG 420

ACATCGAGCT CACCCAGTCT CCAAAATCCA TGTCCATGTC CGTAGGAGAG AGGGTCACCT 480

TGAGCTGCAA GGCCAGTGAG ACTGTGGATT CTTTTGTGTC CTGGTATCAA CAGAAACCAG 540

AACAGTCTCC TAAATTGTTG ATATTCGGGG CATCCAACCG GTTCAGTGGG GTCCCCGATC 600

GCTTCACTGG CAGTGGATCT GCAACAGACT TCACTCTGAC CATCAGCAGT GTGCAGGCTG 660

AGGACTTTGC GGATTACCAC TGTGGACAGA CTTACAATCA TCCGTATACG TTCGGAGGGG 720

GGACCAAGCT CGAG 734



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2685 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Saccharomyces cerevisiae

(vii) IMMEDIATE SOURCE:

(B) CLONE: pYY105

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..2685

(D) OTHER INFORMATION: /product= "Flocculation protein"

/gene= "FLO1"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

ATG ACA ATG CCT CAT CGC TAT ATG TTT TTG GCA GTC TTT ACA CTT CTG 48
Met Thr Met Pro His Arg Tyr Met Phe Leu Ala Val Phe Thr Leu Leu
1 5 10 15

GCA CTA ACT AGT GTG GCC TCA GGA GCC ACA GAG GCG TGC TTA CCA GCA 96
Ala Leu Thr Ser Val Ala Ser Gly Ala Thr Glu Ala Cys Leu Pro Ala
20 25 30

GGC CAG AGG AAA AGT GGG ATG AAT ATA AAT TTT TAC CAG TAT TCA TTG 144
Gly Gln Arg Lys Ser Gly Met Asn Ile Asn Phe Tyr Gln Tyr Ser Leu
35 40 45

AAA GAT TCC TCC ACA TAT TCG AAT GCA GCA TAT ATG GCT TAT GGA TAT 192
Lys Asp Ser Ser Thr Tyr Ser Asn Ala Ala Tyr Met Ala Tyr Gly Tyr
50 55 60

GCC TCA AAA ACC AAA CTA GGT TCT GTC GGA GGA CAA ACT GAT ATC TCG 240
Ala Ser Lys Thr Lys Leu Gly Ser Val Gly Gly Gln Thr Asp Ile Ser
65 70 75 80

ATT GAT TAT AAT ATT CCC TGT GTT AGT TCA TCA GGC ACA TTT CCT TGT 288
Ile Asp Tyr Asn Ile Pro Cys Val Ser Ser Ser Gly Thr Phe Pro Cys
85 90 95

CCT CAA GAA GAT TCC TAT GGA AAC TGG GGA TGC AAA GGA ATG GGT GCT 336
Pro Gln Glu Asp Ser Tyr Gly Asn Trp Gly Cys Lys Gly Met Gly Ala
100 105 110

TGT TCT AAT AGT CAA GGA ATT GCA TAC TGG AGT ACT GAT TTA TTT GGT 384
Cys Ser Asn Ser Gln Gly Ile Ala Tyr Trp Ser Thr Asp Leu Phe Gly
115 120 125

TTC TAT ACT ACC CCA ACA AAC GTA ACC CTA GAA ATG ACA GGT TAT TTT 432
Phe Tyr Thr Thr Pro Thr Asn Val Thr Leu Glu Met Thr Gly Tyr Phe
130 135 140

TTA CCA CCA CAG ACG GGT TCT TAC ACA TTC AAG TTT GCT ACA GTT GAC 480
Leu Pro Pro Gln Thr Gly Ser Tyr Thr Phe Lys Phe Ala Thr Val Asp
145 150 155 160

GAC TCT GCA ATT CTA TCA GTA GGT GGT GCA ACC GCG TTC AAC TGT TGT 528
Asp Ser Ala Ile Leu Ser Val Gly Gly Ala Thr Ala Phe Asn Cys Cys

165 170 175

GCT CAA CAG CAA CCG CCG ATC ACA TCA ACG AAC TTT ACC ATT GAC GGT
Ala Gln Gln Gln Pro Pro Ile Thr Ser Thr Asn Phe Thr Ile Asp Gly
180 185 190

ATC AAG CCA TGG GGT GGA AGT TTG CCA CCT AAT ATC GAA GGA ACC GTC
Ile Lys Pro Trp Gly Gly Ser Leu Pro Pro Asn Ile Glu Gly Thr Val
195 200 205

TAT ATG TAC GCT GGC TAC TAT TAT CCA ATG AAG GTT GTT TAC TCG AAC
Tyr Met Tyr Ala Gly Tyr Tyr Tyr Pro Met Lys Val Val Tyr Ser Asn
210 215 220

GCT GTT TCT TGG GGT ACA CTT CCA ATT AGT GTG ACA CTT CCA GAT GGT
Ala Val Ser Trp Gly Thr Leu Pro Ile Ser Val Thr Leu Pro Asp Gly
225 230 235 240

ACC ACT GTA AGT GAT GAC TTC GAA GGG TAC GTC TAT TCC TTT GAC GAT
Thr Thr Val Ser Asp Asp Phe Glu Gly Tyr Val Tyr Ser Phe Asp Asp
245 250 255

GAC CTA AGT CAA TCT AAC TGT ACT GTC CCT GAC CCT TCA AAT TAT GCT
Asp Leu Ser Gln Ser Asn Cys Thr Val Pro Asp Pro Ser Asn Tyr Ala
260 265 270

GTC AGT ACC ACT ACA ACT ACA ACG GAA CCA TGG ACC GGT ACT TTC ACT
Val Ser Thr Thr Thr Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr
275 280 285

TCT ACA TCT ACT GAA ATG ACC ACC GTC ACC GGT ACC AAC GGC GTT CCA
Ser Thr Ser Thr Glu Met Thr Thr Val Thr Gly Thr Asn Gly Val Pro
290 295 300

ACT GAC GAA ACC GTC ATT GTC ATC AGA ACT CCA ACC AGT GAA GGT CTA
Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu Gly Leu
305 310 315 320

ATC AGC ACC ACC ACT GAA CCA TGG ACT GGC ACT TTC ACT TCG ACT TCC
Ile Ser Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr Ser Thr Ser
325 330 335

ACT GAG GTT ACC ACC ATC ACT GGA ACC AAC GGT CAA CCA ACT GAC GAA
Thr Glu Val Thr Thr Ile Thr Gly Thr Asn Gly Gln Pro Thr Asp Glu
340 345 350

ACT GTG ATT GTT ATC AGA ACT CCA ACC AGT GAA GGT CTA ATC AGC ACC
Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu Gly Leu Ile Ser Thr
355 360 365

ACC ACT GAA CCA TGG ACT GGT ACT TTC ACT TCT ACA TCT ACT GAA ATG
Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr Ser Thr Ser Thr Glu Met
370 375 380

ACC ACC GTC ACC GGT ACT AAC GGT CAA CCA ACT GAC GAA ACC GTG ATT
Thr Thr Val Thr Gly Thr Asn Gly Gln Pro Thr Asp Glu Thr Val Ile
385 390 395 400

GTT ATC AGA ACT CCA ACC AGT GAA GGT TTG GTT ACA ACC ACC ACT GAA
Val Ile Arg Thr Pro Thr Ser Glu Gly Leu Val Thr Thr Thr Thr Glu
405 410 415

CCA TGG ACT GGT ACT TTT ACT TCG ACT TCC ACT GAA ATG TCT ACT GTC
Pro Trp Thr Gly Thr Phe Thr Ser Thr Ser Thr Glu Met Ser Thr Val
420 425 430

ACT GGA ACC AAT GGC TTG CCA ACT GAT GAA ACT GTC ATT GTT GTC AAA
Thr Gly Thr Asn Gly Leu Pro Thr Asp Glu Thr Val Ile Val Val Lys
435 440 445

ACT CCA ACT ACT GCC ATC TCA TCC AGT TTG TCA TCA TCA TCT TCA GGA
Thr Pro Thr Thr Ala Ile Ser Ser Ser Leu Ser Ser Ser Ser Ser Gly
450 455 460

CAA ATC ACC AGC TCT ATC ACC TCT TCC CGT CCA ATT ATT ACC CCA TTC
Gln Ile Thr Ser Ser Ile Thr Ser Ser Arg Pro Ile Ile Thr Pro Phe
465 470 475 480

TAT CCT AGC AAT GGA ACT TCT GTC ATT TCT TCC TCA GTA ATT TCT TCC
Tyr Pro Ser Asn Gly Thr Ser Val Ile Ser Ser Ser Val Ile Ser Ser
485 490 495

TCA GTC ACT TCT TCT CTA TTC ACT TCT TCT CCA GTC ATT TCT TCC TCA
Ser Val Thr Ser Ser Leu Phe Thr Ser Ser Pro Val Ile Ser Ser Ser
500 505 510

GTC ATT TCT TCT TCT ACA ACA ACC TCC ACT TCT ATA TTT TCT GAA TCA
Val Ile Ser Ser Ser Thr Thr Thr Ser Thr Ser Ile Phe Ser Glu Ser
515 520 525

TCT AAA TCA TCC GTC ATT CCA ACC AGT AGT TCC ACC TCT GGT TCT TCT
Ser Lys Ser Ser Val Ile Pro Thr Ser Ser Ser Thr Ser Gly Ser Ser
530 535 540

GAG AGC GAA ACG AGT TCA GCT GGT TCT GTC TCT TCT TCC TCT TTT ATC
Glu Ser Glu Thr Ser Ser Ala Gly Ser Val Ser Ser Ser Ser Phe Ile
545 550 555 560

TCT TCT GAA TCA TCA AAA TCT CCT ACA TAT TCT TCT TCA TCA TTA CCA
Ser Ser Glu Ser Ser Lys Ser Pro Thr Tyr Ser Ser Ser Ser Leu Pro
565 570 575

CTT GTT ACC AGT GCG ACA ACA AGC CAG GAA ACT GCT TCT TCA TTA CCA
Leu Val Thr Ser Ala Thr Thr Ser Gln Glu Thr Ala Ser Ser Leu Pro
580 585 590

CCT GCT ACC ACT ACA AAA ACG AGC GAA CAA ACC ACT TTG GTT ACC GTG
Pro Ala Thr Thr Thr Lys Thr Ser Glu Gln Thr Thr Leu Val Thr Val
595 600 605

ACA TCC TGC GAG TCT CAT GTG TGC ACT GAA TCC ATC TCC CCT GCG ATT
Thr Ser Cys Glu Ser His Val Cys Thr Glu Ser Ile Ser Pro Ala Ile
610 615 620

GTT TCC ACA GCT ACT GTT ACT GTT AGC GGC GTC ACA ACA GAG TAT ACC
Val Ser Thr Ala Thr Val Thr Val Ser Gly Val Thr Thr Glu Tyr Thr
625 630 635 640

ACA TGG TGC CCT ATT TCT ACT ACA GAG ACA ACA AAG CAA ACC AAA GGG
Thr Trp Cys Pro Ile Ser Thr Thr Glu Thr Thr Lys Gln Thr Lys Gly
645 650 655

ACA ACA GAG CAA ACC ACA GAA ACA ACA AAA CAA ACC ACG GTA GTT ACA
Thr Thr Glu Gln Thr Thr Glu Thr Thr Lys Gln Thr Thr Val Val Thr
660 665 670

ATT TCT TCT TGT GAA TCT GAC GTA TGC TCT AAG ACT GCT TCT CCA GCC
Ile Ser Ser Cys Glu Ser Asp Val Cys Ser Lys Thr Ala Ser Pro Ala
675 680 685

ATT GTA TCT ACA AGC ACT GCT ACT ATT AAC GGC GTT ACT ACA GAA TAC
Ile Val Ser Thr Ser Thr Ala Thr Ile Asn Gly Val Thr Thr Glu Tyr
690 695 700

ACA ACA TGG TGT CCT ATT TCC ACC ACA GAA TCG AGG CAA CAA ACA ACG
Thr Thr Trp Cys Pro Ile Ser Thr Thr Glu Ser Arg Gln Gln Thr Thr
705 710 715 720

CTA GTT ACT GTT ACT TCC TGC GAA TCT GGT GTC TGT TCC GAA ACT GCT
Leu Val Thr Val Thr Ser Cys Glu Ser Gly Val Cys Ser Glu Thr Ala
725 730 735

TCA CCT GCC ATT GTT TCC ACG GCC ACG GCT ACT GTG AAT GAT GTT GTT
Ser Pro Ala Ile Val Ser Thr Ala Thr Ala Thr Val Asn Asp Val Val
740 745 750

ACG GTC TAT CCT ACA TGG AGG CCA CAG ACT GCG AAT GAA GAG TCT GTC
Thr Val Tyr Pro Thr Trp Arg Pro Gln Thr Ala Asn Glu Glu Ser Val
755 760 765

AGC TCT AAA ATG AAC AGT GCT ACC GGT GAG ACA ACA ACC AAT ACT TTA
Ser Ser Lys Met Asn Ser Ala Thr Gly Glu Thr Thr Thr Asn Thr Leu
770 775 780

GCT GCT GAA ACG ACT ACC AAT ACT GTA GCT GCT GAG ACG ATT ACC AAT
Ala Ala Glu Thr Thr Thr Asn Thr Val Ala Ala Glu Thr Ile Thr Asn
785 790 795 800

ACT GGA GCT GCT GAG ACG AAA ACA GTA GTC ACC TCT TCG CTT TCA AGA
Thr Gly Ala Ala Glu Thr Lys Thr Val Val Thr Ser Ser Leu Ser Arg
805 810 815

TCT AAT CAC GCT GAA ACA CAG ACG GCT TCC GCG ACC GAT GTG ATT GGT
Ser Asn His Ala Glu Thr Gln Thr Ala Ser Ala Thr Asp Val Ile Gly
820 825 830

CAC AGC AGT AGT GTT GTT TCT GTA TCC GAA ACT GGC AAC ACC AAG AGT
His Ser Ser Ser Val Val Ser Val Ser Glu Thr Gly Asn Thr Lys Ser
835 840 845

CTA ACA AGT TCC GGG TTG AGT ACT ATG TCG CAA CAG CCT CGT AGC ACA
Leu Thr Ser Ser Gly Leu Ser Thr Met Ser Gln Gln Pro Arg Ser Thr
850 855 860

CCA GCA AGC AGC ATG GTA GGA TAT AGT ACA GCT TCT TTA GAA ATT TCA
Pro Ala Ser Ser Met Val Gly Tyr Ser Thr Ala Ser Leu Glu Ile Ser
865 870 875 880

ACG TAT GCT GGC AGT GCA ACA GCT TAC TGG CCG GTA GTG GTT TAA
Thr Tyr Ala Gly Ser Ala Thr Ala Tyr Trp Pro Val Val Val
885 890 895

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 894 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Met Thr Met Pro His Arg Tyr Met Phe Leu Ala Val Phe Thr Leu Leu 
1 5 10 15

Ala Leu Thr Ser Val Ala Ser Gly Ala Thr Glu Ala Cys Leu Pro Ala
20 25 30

Gly Gln Arg Lys Ser Gly Met Asn Ile Asn Phe Tyr Gln Tyr Ser Leu
35 40 45



Lys Asp Ser Ser Thr Tyr Ser Asn Ala Ala Tyr Met Ala Tyr Gly Tyr 
50 55 60

Ala Ser Lys Thr Lys Leu Gly Ser Val Gly Gly Gln Thr Asp Ile Ser
65 70 75 80

Ile Asp Tyr Asn Ile Pro Cys Val Ser Ser Ser Gly Thr Phe Pro Cys
85 90 95

Pro Gln Glu Asp Ser Tyr Gly Asn Trp Gly Cys Lys Gly Met Gly Ala
100 105 110

Cys Ser Asn Ser Gln Gly Ile Ala Tyr Trp Ser Thr Asp Leu Phe Gly
115 120 125

Phe Tyr Thr Thr Pro Thr Asn Val Thr Leu Glu Met Thr Gly Tyr Phe
130 135 140

Leu Pro Pro Gln Thr Gly Ser Tyr Thr Phe Lys Phe Ala Thr Val Asp
145 150 155 160

Asp Ser Ala Ile Leu Ser Val Gly Gly Ala Thr Ala Phe Asn Cys Cys
165 170 175

Ala Gln Gln Gln Pro Pro Ile Thr Ser Thr Asn Phe Thr Ile Asp Gly
180 185 190

Ile Lys Pro Trp Gly Gly Ser Leu Pro Pro Asn Ile Glu Gly Thr Val
195 200 205

Tyr Met Tyr Ala Gly Tyr Tyr Tyr Pro Met Lys Val Val Tyr Ser Asn
210 215 220

Ala Val Ser Trp Gly Thr Leu Pro Ile Ser Val Thr Leu Pro Asp Gly
225 230 235 240

Thr Thr Val Ser Asp Asp Phe Glu Gly Tyr Val Tyr Ser Phe Asp Asp
245 250 255

Asp Leu Ser Gln Ser Asn Cys Thr Val Pro Asp Pro Ser Asn Tyr Ala
260 265 270

Val Ser Thr Thr Thr Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr
275 280 285

Ser Thr Ser Thr Glu Met Thr Thr Val Thr Gly Thr Asn Gly Val Pro
290 295 300

Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu Gly Leu
305 310 315 320

Ile Ser Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr Ser Thr Ser
325 330 335

Thr Glu Val Thr Thr Ile Thr Gly Thr Asn Gly Gln Pro Thr Asp Glu
340 345 350

Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu Gly Leu Ile Ser Thr
355 360 365

Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr Ser Thr Ser Thr Glu Met
370 375 380

Thr Thr Val Thr Gly Thr Asn Gly Gln Pro Thr Asp Glu Thr Val Ile
385 390 395 400

Val Ile Arg Thr Pro Thr Ser Glu Gly Leu Val Thr Thr Thr Thr Glu

405 410 415

Pro Trp Thr Gly Thr Phe Thr Ser Thr Ser Thr Glu Met Ser Thr Val
420 425 430

Thr Gly Thr Asn Gly Leu Pro Thr Asp Glu Thr Val Ile Val Val Lys
435 440 445

Thr Pro Thr Thr Ala Ile Ser Ser Ser Leu Ser Ser Ser Ser Ser Gly
450 455 460

Gln Ile Thr Ser Ser Ile Thr Ser Ser Arg Pro Ile Ile Thr Pro Phe
465 470 475 480

Tyr Pro Ser Asn Gly Thr Ser Val Ile Ser Ser Ser Val Ile Ser Ser
485 490 495

Ser Val Thr Ser Ser Leu Phe Thr Ser Ser Pro Val Ile Ser Ser Ser
500 505 510

Val Ile Ser Ser Ser Thr Thr Thr Ser Thr Ser Ile Phe Ser Glu Ser
515 520 525

Ser Lys Ser Ser Val Ile Pro Thr Ser Ser Ser Thr Ser Gly Ser Ser
530 535 540

Glu Ser Glu Thr Ser Ser Ala Gly Ser Val Ser Ser Ser Ser Phe Ile
545 550 555 560

Ser Ser Glu Ser Ser Lys Ser Pro Thr Tyr Ser Ser Ser Ser Leu Pro
565 570 575

Leu Val Thr Ser Ala Thr Thr Ser Gln Glu Thr Ala Ser Ser Leu Pro
580 585 590

Pro Ala Thr Thr Thr Lys Thr Ser Glu Gln Thr Thr Leu Val Thr Val
595 600 605

Thr Ser Cys Glu Ser His Val Cys Thr Glu Ser Ile Ser Pro Ala Ile
610 615 620

Val Ser Thr Ala Thr Val Thr Val Ser Gly Val Thr Thr Glu Tyr Thr
625 630 635 640

Thr Trp Cys Pro Ile Ser Thr Thr Glu Thr Thr Lys Gln Thr Lys Gly
645 650 655

Thr Thr Glu Gln Thr Thr Glu Thr Thr Lys Gln Thr Thr Val Val Thr
660 665 670

Ile Ser Ser Cys Glu Ser Asp Val Cys Ser Lys Thr Ala Ser Pro Ala
675 680 685

Ile Val Ser Thr Ser Thr Ala Thr Ile Asn Gly Val Thr Thr Glu Tyr
690 695 700

Thr Thr Trp Cys Pro Ile Ser Thr Thr Glu Ser Arg Gln Gln Thr Thr
705 710 715 720

Leu Val Thr Val Thr Ser Cys Glu Ser Gly Val Cys Ser Glu Thr Ala
725 730 735

Ser Pro Ala Ile Val Ser Thr Ala Thr Ala Thr Val Asn Asp Val Val
740 745 750

Thr Val Tyr Pro Thr Trp Arg Pro Gln Thr Ala Asn Glu Glu Ser Val
755 760 765

Ser Ser Lys Met Asn Ser Ala Thr Gly Glu Thr Thr Thr Asn Thr Leu
770 775 780

Ala Ala Glu Thr Thr Thr Asn Thr Val Ala Ala Glu Thr Ile Thr Asn
785 790 795 800

Thr Gly Ala Ala Glu Thr Lys Thr Val Val Thr Ser Ser Leu Ser Arg
805 810 815

Ser Asn His Ala Glu Thr Gln Thr Ala Ser Ala Thr Asp Val Ile Gly
820 825 830

His Ser Ser Ser Val Val Ser Val Ser Glu Thr Gly Asn Thr Lys Ser
835 840 845

Leu Thr Ser Ser Gly Leu Ser Thr Met Ser Gln Gln Pro Arg Ser Thr
850 855 860

Pro Ala Ser Ser Met Val Gly Tyr Ser Thr Ala Ser Leu Glu Ile Ser
865 870 875 880

Thr Tyr Ala Gly Ser Ala Thr Ala Tyr Trp Pro Val Val Val
885 890

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(vii) IMMEDIATE SOURCE: 

(B) CLONE: ChoB template coding strand 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

GCCCCCAGCC GCACCCTCG 19 

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(vii) IMMEDIATE SOURCE:

(B) CLONE: ChoB template non-coding strand

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

CGAGGGTGCG GCTGGGGGC 19



(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 37 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: cho01pcr primer

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

AGATCTGAAT TCGCGGCCGC CCCCAGCCGC ACCCTCG

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: cho02pcr primer 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

AGATCTAAGC TTTCAGCTAG CCTGGATGTC GGACGAGATG AT

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: ChoB template coding strand 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

ATCATCTCGT CCGACATCCA G
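The cho02pcr primer (SEQ ID NO: 19) appears to consist of a 5' tail followed by a stretch that anneals to the ChoB template coding strand (SEQ ID NO: 20). The sketch below separates the two parts. It assumes the garbled characters in both sequences read G and AT respectively, and the HindIII and NheI recognition sequences used in the comments are the standard ones, not values stated in this listing.

```python
def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


primer_cho02 = "AGATCTAAGCTTTCAGCTAGCCTGGATGTCGGACGAGATGAT"  # SEQ ID NO: 19
template_coding = "ATCATCTCGTCCGACATCCAG"                    # SEQ ID NO: 20

# The 3' end of a reverse primer is the reverse complement of the coding strand.
annealing_part = reverse_complement(template_coding)
primer_anneals = primer_cho02.endswith(annealing_part)

# Whatever precedes the annealing part is the non-templated 5' tail.
tail = primer_cho02[: len(primer_cho02) - len(annealing_part)]

print(primer_anneals)
print(tail)
print("AAGCTT" in tail, "GCTAGC" in tail)  # HindIII and NheI sites in the tail
```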

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: ChoB template non-coding strand 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

CTGGATGTCG GACGAGATGA T

(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vii) IMMEDIATE SOURCE: 

(B) CLONE: mutagenesis primer ChoB 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:
CGCGGCGACG GCACCGCCGT ATGCACTGGC GATGACGAGG GC
(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: ChoB template coding strand 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 
GCCCTCGTCA TCCGCAGTGG ATACGGCGGT GCCGTCGCCG CG 
(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: primer prt1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 
AAGATCTATC GATCTTGTTA GCCGGTACA
(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: proteinase template non-coding strand

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 
GACTGTACCG GCTAACAAGA TCGATAGCCC TT
(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: proteinase template coding strand

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

GTCGGCGAAA TCCAAGCAAA GGCGGCT 27 

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 39 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vii) IMMEDIATE SOURCE: 

(B) CLONE: prt2 primer 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

CCCAAGCTTC CCCCCGGCCG TTGCTTGGAT TTCGCCGAC 39 

(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE:

(B) CLONE: EGF1 primer

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 

GGGGCGGCCG CGCTGGAGGA AAAGAAAGTT TGC 33 

(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: EGF receptor template non-coding strand 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

GCAAACTTTC TTTTCCTCCA GAGCCCGACT CCC 33

(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: EGF receptor template coding strand 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

AATGGGCCTA AGATCCCGTC CATCGCCACT 

(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: EGF2 primer

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

CCCCAAGCTT AAGGCTAGCG GACGGGATCT TAGGCCCATT

(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 177 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vii) IMMEDIATE SOURCE: 

(B) CLONE: VhC - AGal linker with MycT and Hinge 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

GAATTCCAGG TCACCGTCTC CTCAGAACAA AAACTCATCT CAGAAGAGGA TCTGAATGAA 

CCAAAGATTC CACAACCTCA ACCAAAGCCA CAACCTCAAC CACAACCACA ACCAAAACCT 

CAACCAAAGC CAGAACCAGA ATCTACTTCC CCAAACTCTC CAGCTAGCCT TAAGCTT 

(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 63 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: VhC - AGal linker with MycT

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:
GAATTCCAGG TCACCGTCTC CTCAGAACAA AAACTCATCT CAGAAGAGGA TCTGAATGCT 60
AGC 63
(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 144 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: VhC - AGal linker with Hinge 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 
GAATTCCAGG TCACCGTCTC CTCAGAACCA AAGATTCCAC AACCTCAACC AAAGCCACAA 60 
CCTCAACCAC AACCACAACC AAAACCTCAA CCAAAGCCAG AACCAGAATC TACTTCCCCA 120 
AAGTCTCCAG CTAGCCTTAA GCTT 144 
(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 119 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: fragment in pUR4421 coding strand 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

AATTTAGCGG CCGCCCAGGT GAAACTGCTC GAGTAAGTGA CTAAGGTCAC CGTCTCCTCA 60 

GAACAAAAAC TCATCTCAGA AGAGGATCTG AATTAATGAG AATTCATCAA ACGGTGATA 119

(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 119 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: fragment in pUR4421 non-coding strand 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 

AGCTTATCAC CGTTTGATGA ATTCTCATTA ATTCAGATCC TCTTCTGAGA TGAGTTTTTG 60 

TTCTGAGGAG ACGGTGACCT TAGTCACTTA CTCGAGCAGT TTCACCTGGG CGGCCGCTA 119

(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE:

(B) CLONE: Myc tail

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu Asn
1 5 10
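The Myc tail peptide of SEQ ID NO: 37 can be recovered by translating the MycT linker of SEQ ID NO: 33 above. The sketch below does this with the standard genetic code. The reading-frame offset of 6 (starting directly after the leading GAATTC) is an assumption made for this illustration, chosen because it yields the Myc tail in frame.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna):
    """Translate DNA in frame 0 into one-letter amino acid codes."""
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# SEQ ID NO: 33, spaces removed.
SEQ_ID_33 = (
    "GAATTCCAGGTCACCGTCTCCTCAGAACAAAAACTCATCTCAGAAGAGGATCTGAATGCT"
    "AGC"
)
READING_FRAME_OFFSET = 6  # assumed: frame begins right after GAATTC

peptide = translate(SEQ_ID_33[READING_FRAME_OFFSET:])
MYC_TAIL = "EQKLISEEDLN"  # SEQ ID NO: 37 in one-letter code

print(peptide)
print(MYC_TAIL in peptide)
```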

(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: BstEII-HindIII linker coding strand

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 

GTCACCGTCT CCTCATAATG A 

(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: BstEII-HindIII linker non-coding strand

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

AGCTTCATTA TGAGGAGACG

(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 38 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vii) IMMEDIATE SOURCE:

(B) CLONE: primer cho03pc

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 

CGGATCCAAG CTTGAGCCTG GATGTCGGAC GAGATGAT 



CLAIMS 



1. A method for immobilizing a binding protein capable of binding to a specific
compound, comprising the use of recombinant DNA techniques for producing
said binding protein or a functional part thereof still having said specific binding 
capability, said protein or said part thereof being linked to the outside of a host cell, 
whereby said binding protein or said part thereof is localized in the cell wall or at 
the exterior of the cell wall by allowing the host cell to produce and secrete a 
chimeric protein in which said binding protein or said functional part thereof is 
bound with its C-terminus to the N-terminus of an anchoring part of an anchoring
protein capable of anchoring in the cell wall of the host cell, which anchoring part is 
derivable from the C-terminal part of said anchoring protein. 

2. The method of claim 1, in which the host is selected from the group 
consisting of Gram-positive bacteria and fungi. 

3. The method of claim 2, in which the host is a Gram-positive bacterium 
selected from the group consisting of lactic acid bacteria, and bacteria belonging to 
the genera Bacillus and Streptomyces.

4. The method of claim 2, in which the host is a fungus selected from the
group consisting of yeasts belonging to the genera Candida, Debaryomyces,
Hansenula, Kluyveromyces, Pichia and Saccharomyces, and moulds belonging to the
genera Aspergillus, Penicillium and Rhizopus.

5. A recombinant polynucleotide comprising 

(i) a structural gene encoding a binding protein or a functional part thereof 
still having the specific binding capability, and 

(ii) at least part of a gene encoding an anchoring protein capable of anchoring 
in the cell wall of a Gram-positive bacterium or a fungus, said part of a 
gene encoding at least the anchoring part of said anchoring protein, which
anchoring part is derivable from the C-terminal part of said anchoring
protein.

6. The polynucleotide of claim 5, wherein the anchoring protein is selected 
from the group consisting of α-agglutinin, a-agglutinin, FLO1, the Major Cell Wall
Protein of a fungus, and proteinase of lactic acid bacteria. 

7. The polynucleotide of claim 5, further comprising a nucleotide sequence 
encoding a signal peptide ensuring secretion of the expression product of the
polynucleotide. 

8. The polynucleotide of claim 7, wherein the signal peptide is derived from a 
protein selected from the group consisting of the α-mating factor of yeast, α-agglutinin
of yeast, invertase of Saccharomyces, inulinase of Kluyveromyces, α-amylase of
Bacillus, and proteinase of lactic acid bacteria. 

9. The polynucleotide of any of claims 5-8, operably linked to a promoter, 
which can be an inducible promoter. 

10. A recombinant vector comprising a polynucleotide as claimed in any of
claims 5-9. 

11. A chimeric protein encoded by a polynucleotide as claimed in any of 
claims 5-9. 

12. A host cell having a cell wall at the outside of its cell and containing at
least one polynucleotide as claimed in any of claims 5-9.

13. The host cell of claim 12, having at least one polynucleotide as claimed in 
any of claims 5-9 integrated in its chromosome.

14. A host cell having a chimeric protein as claimed in claim 11 immobilized
in its cell wall and having the binding protein part of the chimeric protein localized
in the cell wall or at the exterior of the cell wall.

15. The host cell of any of claims 12-14, which is a fungus selected from the 
group consisting of yeasts and moulds.

16. A process for carrying out an isolation process by using an immobilized 
binding protein or functional part thereof still capable of binding to a specific
compound, wherein a medium containing said specific compound is contacted with a 
host cell as claimed in any of claims 12-15 under conditions whereby a complex 
between said specific compound and said immobilized binding protein is formed, 
separating said complex from the medium originally containing said specific 
compound and, optionally, releasing said specific compound from said binding
protein or functional part thereof.